Meaning of the terms O(1) space and without using extra space
According to Fortnow & Homer (2003),
The space complexity of the computation is [...] taken to be the amount of space used on the work tapes.
Once the input itself is counted, every sorting algorithm needs at least O(n) space, since it needs space to store all the inputs (whether in memory or on disk). Therefore, even for bubble sort, the space complexity is still O(n).
However, we are sometimes not interested in the overall space complexity (especially in the case above), but in the additional space used by the algorithm. For bubble sort, we can say that it uses a constant amount of additional space.
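As a rough illustration of "constant additional space", here is a minimal bubble sort sketch in Python. The function name `bubble_sort` and the early-exit flag are my own choices for the example. The point is only that, besides the input list, it allocates a fixed handful of scalar variables regardless of n.

```python
def bubble_sort(items):
    """Sort `items` in place. Extra space: a few scalars (n, end, i, swapped)."""
    n = len(items)
    for end in range(n - 1, 0, -1):
        swapped = False
        for i in range(end):
            if items[i] > items[i + 1]:
                # Swap neighbours inside the input list; no second list is created.
                items[i], items[i + 1] = items[i + 1], items[i]
                swapped = True
        if not swapped:
            # No swaps in a full pass means the list is already sorted.
            break
    return items


data = [5, 2, 9, 1]
bubble_sort(data)  # data is modified in place
```

The O(n) total comes entirely from `data` itself; the algorithm's own bookkeeping does not grow with the input.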
Recursion is a special case where we have to consider the call stack. Each recursive call stores its state on the stack, and the number of calls depends on the input. Since the recursion depth depends on the input size, the space complexity must take the stack space usage into account.
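To make the stack point concrete, here is a small Python sketch comparing two ways of summing a list. Both function names are hypothetical and exist only for this example. The recursive version keeps one stack frame alive per element, so it uses O(n) additional space even though it never allocates a list; the loop version uses O(1) additional space.

```python
def sum_recursive(values, i=0):
    """Sum values[i:]. Recursion depth is len(values) - i, so O(n) stack space."""
    if i == len(values):
        return 0
    # This frame stays on the stack until the deeper call returns.
    return values[i] + sum_recursive(values, i + 1)


def sum_iterative(values):
    """Same result with a single running total: O(1) additional space."""
    total = 0
    for v in values:
        total += v
    return total
```

In CPython the difference is observable: a sufficiently long input makes `sum_recursive` raise `RecursionError`, while `sum_iterative` is unaffected.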
I'm not sure whether O(1) space algorithms are common, but the cycle-finding algorithm is one example. The algorithm itself only uses space for exactly 2 "pointers". Any extra space used by the function whose cycle is to be found should be counted separately.
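Assuming the cycle-finding algorithm meant here is Floyd's "tortoise and hare", a minimal sketch could look like the following. The name `find_cycle` and the successor table `nxt` are made up for the example. Note that this version also keeps two integer counters in order to report the cycle's start and length, which is still a constant amount of space.

```python
def find_cycle(f, x0):
    """Return (mu, lam): index where the cycle starts and its length,
    for the sequence x0, f(x0), f(f(x0)), ...
    Assumes the sequence eventually repeats (e.g. f maps a finite set to itself).
    """
    # Phase 1: the hare moves two steps per tortoise step until they meet.
    tortoise, hare = f(x0), f(f(x0))
    while tortoise != hare:
        tortoise = f(tortoise)
        hare = f(f(hare))

    # Phase 2: restart the tortoise; moving both at the same speed,
    # they meet at the first element of the cycle.
    mu = 0
    tortoise = x0
    while tortoise != hare:
        tortoise = f(tortoise)
        hare = f(hare)
        mu += 1

    # Phase 3: walk the hare once around the cycle to measure its length.
    lam = 1
    hare = f(tortoise)
    while tortoise != hare:
        hare = f(hare)
        lam += 1
    return mu, lam


# Hypothetical successor table: 0 -> 1 -> 2 -> 3 -> 4 -> 2 -> ...
nxt = [1, 2, 3, 4, 2]
print(find_cycle(lambda x: nxt[x], 0))  # (2, 3)
```

The table `nxt` takes O(n) space, but that belongs to the function being examined, not to the cycle finder, which is the distinction the answer draws.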
In the case of counting sort, the space complexity depends on the number of input elements n and the maximum input value k. The two parameters are independent of each other, hence the space complexity is O(n + k). The additional space used can be defined as O(k).
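A simplified counting sort sketch shows where the O(k) term comes from. This variant assumes the keys are plain non-negative integers no larger than `k` and writes the result back into the input list, so the only allocation that grows is the `counts` table. The function name and signature are illustrative, not taken from any library.

```python
def counting_sort(values, k):
    """Sort non-negative integers <= k in place.
    Additional space: the counts table, O(k). The input list is the O(n) part.
    """
    counts = [0] * (k + 1)  # one slot per possible key value
    for v in values:
        counts[v] += 1

    # Overwrite the input with each key repeated as often as it occurred.
    i = 0
    for key, c in enumerate(counts):
        for _ in range(c):
            values[i] = key
            i += 1
    return values


print(counting_sort([3, 0, 2, 3, 1], 3))  # [0, 1, 2, 3, 3]
```

A stable version that sorts records by key would typically also need an O(n) output array, which is one reason the additional space is sometimes quoted as O(n + k) instead.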
"No extra space" implies some amount of space, usually exactly n, is available via the input, and no more should be used, although in an interview I never care if the candidate uses O(1) extra. Honestly you would be hard-pressed in any modern language to avoid O(1) extra space for almost any trivial action you could take.
The stack counts when giving bounds on algorithms' space complexity.
O(1) means constant.
Counting sort uses at minimum O(k) space, where k is the largest possible key magnitude. Therefore, theoretically, if we are talking about integers with a fixed number of bits, k is a constant. That is also why a radix sort is sometimes said to be a linear-time sort.