Count The Genus
APL (Dyalog Extended), 73 bytes
1-1⊥∘(∊∨/¨⍮2(-⊃⍤∧∘⌽/⍮(3⊃∧∘⌽)⌿)⊢)(⊂4⍴2)⊤¨(9467+⎕AV⍳'ð{dòlxõ _0h8p∆')⍳⎕UCS
Try it online!
Over half of the bytes were used in converting the char matrix into an easier-to-use format. Basically, converts each box-drawing char into [right, up, down, left]
, counts non-blanks and joints, and computes 1 - non-blanks + joints
.
This works because of Euler characteristic V - E + F = 1
(not counting the surrounding area as a face), thanks to the assumption that the entire figure is connected. If we count any segment starting from the center to some boundary as an edge (and the respective endpoints as vertices), every non-blank char has V - E = 1
, but the V
in the entire shape is reduced by the number of joints. Plugging into the rearranged equation F = 1 - V + E
gives the above formula 1 - non-blanks + joints
.
Retina, 131 bytes
T`─┐┬╴┼┤┴┘│╵└├┌╶╷`L
(?=(.)*[BCEFILMO].*¶(?>(?<-1>.)*)[E-L])
-
(?=[ACEGK-N]-?[A-H])|^
-
+`-\s*\w
C`-
Try it online!
TIO counts the box drawing characters as only one byte even though they are not ASCII (the Retina code page).
How?
This uses a rearranged Euler characterstic F=1+E-V
to find the number of faces under the assumption that there is only one connected component.
The number of vertices, V
, is the same as the number of box characters.
There is +1 edge for every horizontally adjacent pair of box characters where the first has a right edge and the second has a left edge. There is +1 edge for every vertically adjacent pair of box characters where the first has a down edge and the second has a up edge. This is harder to express in regex, so I borrowed the method of standard vertical matching to deal with vertical edges.
# Translate each of the box characters into letters to avoid repeating them
# We do this smartly:
# ─┐┬╴┼┤┴┘ correspond to A-H and all have a left edge
# ┼┤┴┘│╵└├ correspond to E-L and all have an up edge
# └├┌╶ correspond to K-N and all have a right edge
T`─┐┬╴┼┤┴┘│╵└├┌╶╷`L
# Prepend a '-' to the first character of a vertically adjacent pair with (down edge, up edge)
# Each '-' will represent one edge
(?=(.)*[BCEFILMO].*¶(?>(?<-1>.)*)[E-L])
-
# Prepend a '-' to the first character of a horizontally adjacent pair with (left edge, right edge)
# Simply pass over '-' from the previous step using -?
# |^: also prepend an extra '-', which counts as an extra edge, to the beginning of the whole string for the +1
(?=[ACEGK-N]-?[A-H])|^
-
# Now there is a - for each edge and a capital letter for each vertex, with some whitespace
# Compute #(edges) - #(capital letters)
# Until the output is constant:
# Replace an edge and a capital letter with the empty string (`\s*` to pass over whitespace)
+`-\s*\w
# Count the number of `-`s
C`-
Charcoal, 100 bytes
WS⊞υιB׳⊕Lθ׳⊕LυψFυ«⸿⸿⸿Fι«M³→F⁴¿&X²λ⍘§”E←T¹÷S1²m[ω→℅P”﹪℅κ⊕⊘φφP✳⊗λ²»»≔⁰ηFLθFLυ«J׳ι׳κ↘≧⁺¬℅KKη¤#»⎚I⊖η
Try it online! Link is to verbose version of code. Takes input as a newline-terminated rectangular character array. While I could have written a version that uses the Euler characteristic, I felt that I should really write a version using all of Charcoal's power to calculate the number of regions in any shape. Explanation:
WS⊞υι
Input the shape.
B׳⊕Lθ׳⊕Lυψ
Draw an empty box large enough to completely enclose the shape at three times scale.
Fυ«⸿⸿⸿Fι«M³→
Loop over each character of the input, positioning the cursor at the appropriate position.
F⁴¿&X²λ⍘§”E←T¹÷S1²m[ω→℅P”﹪℅κ⊕⊘φφP✳⊗λ²
Loop over each direction, looking up the box lines based on taking the ordinal of the character modulo 501 and 23 (the latter automatically via Charcoal's cyclic indexing) in a table of hex digits (and g
s for filler), drawing a line for each matching bit in the digit, thus drawing the original shape at triple size.
»»≔⁰η
Start with no regions.
FLθFLυ«
Loop over each character.
J׳ι׳κ↘
Jump to the upper right of the character.
≧⁺¬℅KKη
Increment the region count if it's empty.
¤#
And then try to fill it anyway.
»⎚I⊖η
Print one less then the resulting count (since this procedure counts the outside too).