Field symbol and data reference concept in ABAP
A field symbol, which has been part of ABAP far longer than data references, lets you read and change the values of fields at runtime without knowing the names of those fields beforehand. Consider this use case: you have a structure with 20 fields. You can reference each field by name, assign it to a field symbol, and then change the value of that particular field through the field symbol.
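The use case above can be sketched roughly as follows. This is only a minimal illustration: the report name, the structure ty_address and its components are hypothetical and were made up for this example.

```abap
REPORT zdemo_fs_component.

* Hypothetical structure, for illustration only.
TYPES: BEGIN OF ty_address,
         name TYPE c LENGTH 20,
         city TYPE c LENGTH 20,
         zip  TYPE c LENGTH 10,
       END OF ty_address.

DATA: ls_address TYPE ty_address,
      lv_field   TYPE string.

FIELD-SYMBOLS <lv_value> TYPE any.

ls_address-city = 'Walldorf'.

* The component name is only known at runtime.
lv_field = 'CITY'.

ASSIGN COMPONENT lv_field OF STRUCTURE ls_address TO <lv_value>.
IF sy-subrc = 0.
  " Writing through the field symbol changes ls_address-city itself.
  <lv_value> = 'Berlin'.
ENDIF.

ASSERT ls_address-city = 'Berlin'.
```

Checking sy-subrc matters here: if the name in lv_field does not match any component, the field symbol is not assigned to that component.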
A data reference (TYPE REF TO DATA), which is a relatively newer addition to ABAP, allows you to instantiate data at runtime without knowing the type beforehand, using the CREATE DATA statement.
For an example of the use of CREATE DATA, see the following SAP Help page. It shows, for example, how you can get a reference to a reference object (i.e. an ABAP Objects reference) using CREATE DATA, which is something you could not do with a field symbol: http://help.sap.com/abapdocu_70/en/ABAPCREATE_DATA_REFERENCE.htm
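A simpler case than the one on the help page is creating a data object whose type name is only known at runtime. The sketch below assumes the type name arrives in a string variable; the report and variable names are hypothetical.

```abap
REPORT zdemo_create_data.

DATA: lr_data TYPE REF TO data,
      lv_type TYPE string.

FIELD-SYMBOLS <lv_any> TYPE any.

* The type name is only known at runtime; 'I' is the built-in integer type.
lv_type = 'I'.

* Allocates an anonymous data object of the requested type.
CREATE DATA lr_data TYPE (lv_type).

* A generically typed reference is dereferenced via a field symbol.
ASSIGN lr_data->* TO <lv_any>.
<lv_any> = 42.

ASSERT <lv_any> = 42.
```

Note that the two concepts cooperate here: the data reference owns the newly created data object, and the field symbol gives typed access to it.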
Although data references and field symbols look very similar and are often used in a similar fashion (see the other answers), they are fundamentally different.
Data references are variables that store a value, just like a string or an integer. They have a fixed size in memory and a content. The only difference is that these references are pointers to other data objects, i.e. the content has a special meaning. They can point nowhere, they can be dereferenced, you can pass them along to other routines, you can manipulate either the pointer (GET REFERENCE) or the value it points to. Nothing special to it, really - just pointers as you know them from your favorite programming language.
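A short sketch of these pointer-like properties, using hypothetical variable names: a reference can be initial, can be copied like any other value, and can be used to change its target.

```abap
REPORT zdemo_ref_basics.

DATA: lv_int TYPE i VALUE 1,
      lr_a   TYPE REF TO i,
      lr_b   TYPE REF TO i.

* A freshly declared reference points nowhere.
ASSERT lr_a IS INITIAL.

* Manipulate the pointer: let it point to lv_int.
GET REFERENCE OF lv_int INTO lr_a.

* Copying a reference copies the pointer, not the integer.
lr_b = lr_a.

* Manipulate the value it points to.
lr_b->* = 5.
ASSERT lv_int = 5.

* Clearing one reference changes only that pointer.
CLEAR lr_a.
ASSERT lr_a IS NOT BOUND AND lr_b IS BOUND.
```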
Field symbols are not "real" variables. The documentation states that
They do not physically reserve space for a field
Field Symbols are really only clever manipulations of the local symbol table of the ABAP VM. I'll try to illustrate this - note that this is a heavily simplified model. Let's say you declare three variables:
DATA: my_char TYPE c,
      my_int  TYPE i,
      my_ref  TYPE REF TO i.
Then the symbol table will contain - among others - entries that might look like this:
name      type  size  addr
------------------------------
MY_CHAR   c     1     0x123456
MY_INT    i     4     0x123457
MY_REF    r     ?     0x123461
(I'm not sure about the actual size of a reference variable.)
These entries only point to an address that contains the values. Depending on the scope of these variables, they might reside in totally different memory areas, but that's not our concern at the moment. The important points are:
- Memory has to be reserved for the variables (this is done automatically, even for references).
- References work just like all the other variables.
Let's add a field symbol to this:
FIELD-SYMBOLS: <my_fs> TYPE any.
Then the symbol table might look like this:
name      type  size  addr      target
--------------------------------------
MY_CHAR   c     1     0x123456
MY_INT    i     4     0x123457
MY_REF    r     ?     0x123461
<MY_FS>   *
The field symbol is created in its initial state (unassigned). It doesn't point anywhere, and using it in this state will result in a short dump. The important point is: it is not backed by "heap" memory like the other variables. Let's assign a target to it:
ASSIGN my_char TO <my_fs>.
Again, the symbol table might look like this:
name      type  size  addr      target
--------------------------------------
MY_CHAR   c     1     0x123456
MY_INT    i     4     0x123457
MY_REF    r     ?     0x123461
<MY_FS>   *                     MY_CHAR
Now, when accessing <my_fs>, the runtime system will recognize it as a field symbol, look up the current target in the symbol table and redirect all operations to the actual location of my_char. If, on the other hand, you'd issue the command
GET REFERENCE OF my_int INTO my_ref.
the symbol table would not change, but at the "heap address" 0x123461, you'd find the "address" 0x123457. This is just a value assignment like my_char = 'X' or my_int = 42 * 2.
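Putting the statements from this walk-through together gives roughly the following runnable sketch. The report name is hypothetical and the ASSERT statements were added only to make the observable behaviour explicit; the symbol table itself is, of course, not visible from ABAP code.

```abap
REPORT zdemo_fs_vs_ref.

DATA: my_char TYPE c,
      my_int  TYPE i,
      my_ref  TYPE REF TO i.

FIELD-SYMBOLS <my_fs> TYPE any.

* Initial state: the field symbol has no target yet.
ASSERT <my_fs> IS NOT ASSIGNED.

* From now on, operations on <my_fs> are redirected to my_char.
ASSIGN my_char TO <my_fs>.
<my_fs> = 'X'.
ASSERT my_char = 'X'.

* Storing a reference is an ordinary value assignment to my_ref.
GET REFERENCE OF my_int INTO my_ref.

* Dereferencing must be written out explicitly with ->*.
my_ref->* = 42 * 2.
ASSERT my_int = 84.
```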
This is, in a very simplified version, the reason why you cannot pass field symbols as changing parameters and allow them to be reassigned inside the subroutine. They do not exist in the same way that other variables do, and they have no meaning outside of the scope of the symbol table they were added to.
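By contrast, a data reference is an ordinary variable, so a subroutine can receive it as a changing parameter and let it point somewhere else. The following is a minimal sketch with hypothetical names (report, globals and the FORM rebind), not a recommended design:

```abap
REPORT zdemo_rebind_ref.

DATA: gv_a   TYPE i VALUE 1,
      gv_b   TYPE i VALUE 2,
      gr_int TYPE REF TO i.

START-OF-SELECTION.
  GET REFERENCE OF gv_a INTO gr_int.
  ASSERT gr_int->* = 1.

  " The subroutine re-targets the caller's reference variable.
  PERFORM rebind CHANGING gr_int.
  ASSERT gr_int->* = 2.

* Hypothetical helper: lets the passed reference point to gv_b.
FORM rebind CHANGING cr_int TYPE REF TO i.
  GET REFERENCE OF gv_b INTO cr_int.
ENDFORM.
```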
The field-symbol is much like a pointer, but one that you can only access in a dereferenced form. In other words, it internally holds the memory address of the variable that was assigned to it, but it does not let you see that address, only the data stored in the variable it points to. You can observe this with internal tables: if you change the contents of a field-symbol that points to an internal table line, the change is made directly in that line.
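The internal table case can be sketched like this, with a hypothetical table of integers. No MODIFY statement is needed, because the field symbol refers to the table line itself rather than to a copy.

```abap
REPORT zdemo_fs_table_line.

DATA: lt_numbers TYPE STANDARD TABLE OF i WITH DEFAULT KEY,
      lv_number  TYPE i.

FIELD-SYMBOLS <lv_line> TYPE i.

APPEND 1 TO lt_numbers.
APPEND 2 TO lt_numbers.

* Each pass assigns the current line (not a copy) to the field symbol.
LOOP AT lt_numbers ASSIGNING <lv_line>.
  <lv_line> = <lv_line> * 10.
ENDLOOP.

* The table content itself has changed.
READ TABLE lt_numbers INDEX 2 INTO lv_number.
ASSERT lv_number = 20.
```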
A data reference acts like a simple pointer, except that you can't increment or decrement the memory address like in C (ptr++, ptr-- and such). It differs from a field-symbol because you can compare two data references to check whether they point to the exact same spot in memory. Comparing two field-symbols is a simple comparison of the values they point to. Another difference is that you can allocate memory dynamically by creating data references with the CREATE DATA statement. A field-symbol can only be assigned to an already allocated variable.
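The difference in comparison semantics can be illustrated with two variables that hold the same value. The names below are hypothetical.

```abap
REPORT zdemo_compare.

DATA: lv_a TYPE i VALUE 7,
      lv_b TYPE i VALUE 7,
      lr_a TYPE REF TO i,
      lr_b TYPE REF TO i.

FIELD-SYMBOLS: <lv_a> TYPE i,
               <lv_b> TYPE i.

GET REFERENCE OF lv_a INTO lr_a.
GET REFERENCE OF lv_b INTO lr_b.
ASSIGN lv_a TO <lv_a>.
ASSIGN lv_b TO <lv_b>.

* References compare by target: two different variables, so not equal.
ASSERT lr_a <> lr_b.

* Field symbols compare by content: both targets hold 7.
ASSERT <lv_a> = <lv_b>.

* After re-targeting, both references point to the same variable.
GET REFERENCE OF lv_a INTO lr_b.
ASSERT lr_a = lr_b.
```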