Is a*b* regular?

Answer to your question:

imagine a*b* to be regular, Is there a proof for it being regular or not?

No need to imagine, expression a*b* is called regular expression (re), and regular expressions are possible only for regular languages. If a language is not regular then regular expression is also not possible for that and if a language is regular language then we can always represent it by some regular expression.

Yes, a*b* represents a regular language.

Language description: Any number of a followed by any numbers of b (by any number I mean zero (including null ^) or more times). Some example strings are:

{^, a, b, aab, abbb, aabbb, ...}

DFA for RE a*b* will be as follows:

      a-            b-
      ||            ||
      ▼|            ▼|
---►((Q0))---b---►((Q1))

In figure: `(())` means final state, so both `{Q0, Q1}` are final states.

You need to understand following basic concept:

What is basically a regular language? And why an infinite language `a*b*` is regular whereas languages like `{ anbn | n > 0 }` are not regular!!

A language(a set) is called regular language, if it requires only bounded(finite) amount of information to keep store at any instance of time while processing strings of the language.

So, what is 'bounded' information?
For example: Consider a fan 'on'/'off' switch. By viewing fan switch we can say whether the fan is in the on or off state (this is bounded or limited information). But we cannot tell 'how many times' a fan has been switched to on or off in the past! (to memorize this, we require a mechanism to store an 'unbounded' amount of information to count — 'how many times' e.g. the meter used in our cars/bikes).

The language { anbn | n > 0 } is not a regular language because here n is unbounded(it can be infinitely large). To verify strings in the language anbn, we need to memorize how many a symbols there are and it requires an infinite memory storage to count because the number of a symbols in the string can be infinitely large!

That means an automata is only capable of processing strings of the language anbn if it has infinite memory e.g PDA.

Whereas, a*b* is of course regular by its nature, because there is the bounded restriction ‐ that b may come after some a ( and a can't came after b). And that is why every string of this language can be easily processed (or recognized) by an automata in which we have finite memory - and finite automata is a class of automata where memory is finite. Yes, in finite automata, we have finite amount of memory in the term of states.

(Memory in finite automata is present in the form of states Q and according to automata principal: any automata can have only finite states. hence finite automata have finite memory, this is the reason the class of automata for regular languages is called finite automata. You can think of a finite automata like a CPU without memory, that has finite register to remember its internal states)

Finite State ⇒ Finite Memory ⇒ Only language can be processed for which finite memory needs to store at any instance of time while processing the string ⇒ that language is called Regular Language

Absent of external memory is limitation of finite automate ⇒ or we can say limitation of finite automata defined class of language called Regular Language.

You should read other answer "finiteness of regular language" to learn scope of regular language.

side note::

  • language { anbn | n > 0 } is subset of a*b*
  • Also a language { anbn | 10>100 n > 0 } is regular, a large set but regular because n is bounded, hence finite automata and regular expression is possible for this language.

You should also read: How to prove a language is regular?


The proof is: ((a*)(b*)) is a well-formed regular expression, hence matching a regular language. a*b* is a syntactic shortenning of the same expression.

Another proof: Regular languages are closed to concatenation. a* is a regular language. b* is a regular language, therefore their concatenation, a*b*, is also a regular expression.

You can build an automat for it:

0 ->(a) 1
0 ->(b) 2
1 ->(a) 1
1 ->(b) 2
2 ->(b) 2
2 ->(a) 3
3 ->(a,b) 3

where only 3 is not an accepting state, and prove that the language is a*b*.