Is there a simple way to extract numbers from a string following certain rules?

Your rules are rather complex, so you can try building a finite state machine (FSM; the deterministic kind is also called a DFA, deterministic finite automaton).

Every character causes a transition between states.

For example, when you are in the state "integer started" and meet a space character, you yield the integer value and the FSM goes into the state "anything wanted".

If you are in the state "integer started" and meet '.', the FSM goes into the state "float or integer list started", and so on.
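
The transitions described above can be written as a function that maps the current state and the next character to the following state. This is only a minimal sketch: the type TState, its state names and the function NextState are made up for illustration, and the rule set is deliberately incomplete.

```pascal
program FsmTransitionSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type
  // Hypothetical state names, chosen for this illustration only
  TState = (AnythingWanted, IntegerStarted, FloatOrListStarted);

// Returns the state the machine moves to after reading ch in state Current.
function NextState(Current: TState; ch: Char): TState;
begin
  Result := Current;  // default: stay in the same state
  case Current of
    AnythingWanted:
      if (ch >= '0') and (ch <= '9') then
        Result := IntegerStarted;
    IntegerStarted:
      case ch of
        ' ': Result := AnythingWanted;      // the integer is complete here
        '.': Result := FloatOrListStarted;  // could become a float or a list
      end;
    FloatOrListStarted:
      if ch = ' ' then
        Result := AnythingWanted;
  end;
end;

var
  State: TState;
  i: Integer;
  Input: string;
begin
  Input := '12.5 7';
  State := AnythingWanted;
  for i := 1 to Length(Input) do
  begin
    State := NextState(State, Input[i]);
    WriteLn(Input[i], ' -> state ', Ord(State));  // prints the state's ordinal
  end;
end.
```

A real parser would also collect the characters and emit a value whenever a transition completes a number; the sketch only shows how the state changes.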


The answer is quite close, but there are several basic errors. To give you some hints (without writing your code for you): within the while loop you MUST ALWAYS increment the index (the increment should not be where it is now, otherwise you get an infinite loop); you MUST check that you have not reached the end of the string (otherwise you get an exception); and finally, your while loop should not be dependent on CH1, because that never changes (again resulting in an infinite loop). But my best advice here is to trace through your code with the debugger; that is what it is there for. Your mistakes will then become obvious.
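
For reference, a loop that follows all three hints usually has the shape below. This is a generic sketch, not a corrected version of your code: ReadDigits is a hypothetical helper, and it assumes 1-based strings and Start >= 1.

```pascal
program SafeScanSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

// Hypothetical helper: collects the run of digits that begins at position
// Start. Assumes 1-based string indexing and Start >= 1.
function ReadDigits(const s: string; Start: Integer): string;
var
  ix: Integer;
begin
  Result := '';
  ix := Start;
  // The bounds check comes first, so s[ix] is never read past the end, and
  // the condition looks at the character at the CURRENT index, not at a
  // variable that was assigned once before the loop.
  while (ix <= Length(s)) and (s[ix] >= '0') and (s[ix] <= '9') do
  begin
    Result := Result + s[ix];
    Inc(ix);  // advance on every pass, otherwise the loop never ends
  end;
end;

begin
  WriteLn(ReadDigits('123 abc', 1));  // prints 123
  WriteLn(ReadDigits('42', 1));       // prints 42, stopping at the end of the string
  WriteLn(ReadDigits('abc', 1));      // prints an empty line
end.
```

The condition relies on short-circuit Boolean evaluation, which is the default setting in both Delphi and Free Pascal.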


You have got answers and comments that suggest using a state machine, and I support that fully. From the code you show in Edit1, I see that you have still not implemented a state machine. From the comments I guess you don't know how to do that, so to push you in that direction, here's one approach:

Define the states you need to work with:

type
  TReadState = (ReadingIdle, ReadingText, ReadingInt, ReadingFloat);
  // ReadingIdle, initial state or if no other state applies
  // ReadingText, needed to deal with strings that include digits (P7..)
  // ReadingInt, state that collects the characters that form an integer
  // ReadingFloat, state that collects characters that form a float

Then define the skeleton of your state machine. To keep it as easy as possible I chose a straightforward procedural approach, with one main procedure and four subprocedures, one for each state.

procedure ParseString(const s: string; strings: TStrings);
var
  ix: integer;
  ch: Char;
  len: integer;
  str,           // to collect characters which form a value
  res: string;   // holds a final value if not empty
  State: TReadState;

  // subprocedures, one for each state; only the headers are shown here,
  // the full bodies (given further down) go in this place
  procedure DoReadingIdle(ch: char; var str, res: string);
  procedure DoReadingText(ch: char; var str, res: string);
  procedure DoReadingInt(ch: char; var str, res: string);
  procedure DoReadingFloat(ch: char; var str, res: string);

begin
  State := ReadingIdle;
  len := Length(s);
  res := '';
  str := '';
  ix := 1;
  while ix <= len do  // unlike repeat..until, this is safe for an empty string
  begin
    ch := s[ix];
    case State of
      ReadingIdle:  DoReadingIdle(ch, str, res);
      ReadingText:  DoReadingText(ch, str, res);
      ReadingInt:   DoReadingInt(ch, str, res);
      ReadingFloat: DoReadingFloat(ch, str, res);
    end;
    if res <> '' then
    begin
      strings.Add(res);
      res := '';
    end;
    inc(ix);
  end;
  // if State is either ReadingInt or ReadingFloat, the input string
  // ended with a digit as the final character of an integer or a float,
  // and we have a pending value to add to the list
  case State of
    ReadingInt: strings.Add(str + ' (integer)');
    ReadingFloat: strings.Add(str + ' (float)');
  end;
end;

That is the skeleton. The main logic is in the four state procedures.

  procedure DoReadingIdle(ch: char; var str, res: string);
  begin
    case ch of
      '0'..'9': begin
        str := ch;
        State := ReadingInt;
      end;
      ' ','.': begin
        str := '';
        // no state change
      end
      else begin
        str := ch;
        State := ReadingText;
      end;
    end;
  end;

  procedure DoReadingText(ch: char; var str, res: string);
  begin
    case ch of
      ' ','.': begin  // terminates ReadingText state
        str := '';
        State := ReadingIdle;
      end
      else begin
        str := str + ch;
        // no state change
      end;
    end;
  end;

  procedure DoReadingInt(ch: char; var str, res: string);
  begin
    case ch of
      '0'..'9': begin
        str := str + ch;
      end;
      '.': begin  // ok, seems we are reading a float
        str := str + ch;
        State := ReadingFloat;  // change state
      end;
      ' ',',': begin // end of int reading, set res
        res := str + ' (integer)';
        str := '';
        State := ReadingIdle;
      end;
    end;
  end;

  procedure DoReadingFloat(ch: char; var str, res: string);
  begin
    case ch of
      '0'..'9': begin
        str := str + ch;
      end;
      ' ','.',',': begin  // end of float reading, set res
        res := str + ' (float)';
        str := '';
        State := ReadingIdle;
      end;
    end;
  end;

The state procedures should be self-explanatory, but just ask if something is unclear.

Both of your test strings produce the values you listed. One of your rules was a little ambiguous, though, and my interpretation might be wrong:

numbers cannot be preceded by a letter

The example you provided is "P7", and in your code you only checked the immediately preceding character. But what if it read "P71"? I interpreted the rule to mean that the "1" should be omitted just like the "7", even though the character preceding the "1" is "7". This is the main reason for the ReadingText state, which ends only on a space or a period.
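
To see the effect of that interpretation in isolation, here is a reduced, self-contained sketch. It is not the ParseString procedure from above: the type TMiniState and the function IntsOutsideText are hypothetical, floats are left out on purpose, and a token such as "7a" is dropped entirely, which is my own assumption.

```pascal
program TextStateSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type
  TMiniState = (Idle, InText, InInt);  // reduced, hypothetical set of states

// Returns the integers found in s, each followed by ';'. Digits belonging
// to a token that started with a non-digit (for example P71) are skipped.
function IntsOutsideText(const s: string): string;
var
  ix: Integer;
  ch: Char;
  cur: string;
  State: TMiniState;
begin
  Result := '';
  cur := '';
  State := Idle;
  for ix := 1 to Length(s) + 1 do
  begin
    // a virtual trailing space flushes a number at the very end of the input
    if ix <= Length(s) then ch := s[ix] else ch := ' ';
    if (ch = ' ') or (ch = '.') then
    begin
      // space or period: the only characters that end a token
      if State = InInt then
        Result := Result + cur + ';';
      cur := '';
      State := Idle;
    end
    else if (ch >= '0') and (ch <= '9') and (State = Idle) then
    begin
      cur := ch;
      State := InInt;
    end
    else if (ch >= '0') and (ch <= '9') and (State = InInt) then
      cur := cur + ch
    else
      State := InText;  // stays InText until the next space or period
  end;
end;

begin
  WriteLn(IntsOutsideText('P71 12 x9 5'));  // prints 12;5;
end.
```

Because the text state is left only on a space or a period, both digits of "P71" are skipped, not just the "7" that directly follows the letter.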