How to efficiently represent and manipulate polynomials in software?
Note: this answer turned out a bit TL;DR; if you just want a direct suggestion (instead of a survey), skip to the last paragraph.
Since the typical use case has now been made clear (in the last comment), I suggest you go over this 2007 presentation by M. Gastineau, especially slides 8-15. It summarizes the available representation options and what several CAS packages use. You should then skim the sequel of that presentation, especially slides 28-30, assuming multiplication is a common use case for you; even if it isn't, the second presentation is worth a look because it mentions the representations used by several other CAS packages that are not covered in the first one.
Putting these together, there's a fair bit of info, e.g. GIAC and MAGMA use hashmaps, GINAC and Mathematica 5 use trees, Maple 10 uses DAGs, Maxima uses a recursive list, Singular uses sorted monomial lists (which work surprisingly well), etc. [I guess there's no implementation info on newer versions of commercial products.] I suspect some of the more obscure representations, like those used by TRIP (Gastineau's own CAS kernel), may be hard to code despite being fast... Pari/GP appears fast too, based on the opening slide of the first presentation, but the slides don't say what representation it uses. Looking at its FAQ, it appears that multivariate polynomials are faked using univariate ones, so perhaps that's why the info wasn't there... and I'm guessing that trick may not be satisfactory in your application. (SAGE, which MvG discussed in his answer, also uses hashmaps [Python dicts].)
I've looked at Singular's sources a little, because judging from the underlying idea the implementation promised to be relatively simple. Alas, the source code of Singular is rather messy and unevenly commented. One thing I could gather, though, is that they use bucket sort to keep their monomial lists in order.
Since the benchmarks in those slides are ~8 years old, I would not take them at face value for the current versions of the systems mentioned. The bottom line is that there are many representation options and you need to experiment to find out what works best for you. If you're implementing this in a high-level language like Python, some of the low-level optimizations that rely on structure layout to achieve cache locality might not work unless you use something like NumPy arrays.
There's a more recent (2010) paper on the (new) implementation of "sparse" multivariate polynomials in Maple 14 by M. Monagan and R. Pearce. Interestingly, it revives an old idea of packing multiple exponents into a word; besides that, it is just a packed, lexicographically sorted monomial array, so the implementation is rather simple. And it claims to have beaten TRIP. The paper also benchmarks the other usual suspects, but in newer versions (e.g. Mathematica 7) and on more operations than just multiplication. The reason why I put "sparse" in quotes is that the polynomials used in this paper aren't all that sparse; they are in fact similar to those used in the 2007 TRIP presentation, which doesn't call such polynomials sparse; e.g. this paper uses $(1+x+y+z)^{20}+1$, which has ~1700 terms. So I would suggest using this Maple 14 representation, as it seems to be close to (if not) the current "state of the art" as far as performance goes, and it is reasonably simple to implement.
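To give a feel for the exponent-packing trick, here is a minimal Python sketch (not the actual Maple 14 code; the field width and variable count are arbitrary assumptions). The payoff is that multiplying two monomials becomes a single integer addition, and lexicographic comparison becomes plain integer comparison:

```python
# Sketch of exponent packing: store several small exponents in one integer.
# Assumes every exponent fits in BITS bits; overflow is not checked here.

BITS = 16  # bits per exponent field; 4 variables fit in a 64-bit word

def pack(exponents):
    """Pack a tuple of exponents into one integer (most significant first)."""
    word = 0
    for e in exponents:
        word = (word << BITS) | e
    return word

def unpack(word, nvars):
    """Recover the exponent tuple from a packed word."""
    mask = (1 << BITS) - 1
    exps = []
    for _ in range(nvars):
        exps.append(word & mask)
        word >>= BITS
    return tuple(reversed(exps))

# Multiplying monomials is a single addition of packed words
# (valid as long as no field overflows):
a = pack((2, 0, 1, 3))   # x1^2 * x3 * x4^3
b = pack((1, 1, 0, 0))   # x1 * x2
product = a + b          # packs the exponents (3, 1, 1, 3)
```

In C one would use a fixed-width unsigned integer instead of Python's arbitrary-precision ints, which is where the cache-friendliness of the packed monomial array comes from.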
As Alex wrote, I'd stop thinking about multiple variables with multiple components. If you have $x\in\mathbb R^3$ and $y\in\mathbb R^2$, then you have five scalar variables $x_1,x_2,x_3,y_1,y_2$ or equivalently one five-dimensional variable vector. You might want to keep track of the original grouping somewhere, and use that information when constructing polynomials or presenting or interpreting the results, but for the bulk of the operation, this information should simply be carried along, with no impact on how the operations go.
One caveat here is that you should ensure that this information is consistent. If you have one polynomial in $x_1,x_2,x_3$ and another polynomial in $y_1,y_2$ and want to multiply these, you might want to first convert both to a single five-dimensional representation. By declaring all variables you are going to use up front, you can avoid performing such conversions repeatedly during the course of your computation. Sage, PARI and probably others as well require you to declare a polynomial ring, explicitly naming all indeterminates, before you can construct polynomials.
In my experience, multivariate polynomials in practice tend to be pretty sparse. This means that if you fix the total degree, only a tiny fraction of the possible monomials which might fit into that total degree are actually present with non-zero coefficient. Therefore I wouldn't use a vector which stores all coefficients. Instead I'd use a sparse representation, where you essentially have a map from a tuple of powers (i.e. $x_1^2x_3x_4^3$ would be represented as $(2,0,1,3)$) to the corresponding non-zero coefficients.
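In Python, such a sparse representation is most directly a dict mapping exponent tuples to coefficients; here is a small sketch (the names and the `evaluate` helper are illustrative, not from any library):

```python
# 3*x1^2*x3*x4^3 - x2 + 5, over four variables x1..x4:
p = {
    (2, 0, 1, 3): 3,   # 3 * x1^2 * x3 * x4^3
    (0, 1, 0, 0): -1,  # -x2
    (0, 0, 0, 0): 5,   # constant term
}

def evaluate(poly, point):
    """Evaluate a {exponent-tuple: coefficient} polynomial at a point."""
    total = 0
    for exps, coeff in poly.items():
        term = coeff
        for x, e in zip(point, exps):
            term *= x ** e
        total += term
    return total
```

Only non-zero terms consume memory, so the cost grows with the number of terms rather than with the (combinatorially large) number of possible monomials of the given degree.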
I know that at least one implementation inside Sage goes one step further, and makes even that exponent tuple sparse, so that variables you don't actually use in your polynomial won't be wasting any space.
Formally speaking, a map from exponent vectors for $n$ variables to real coefficients would be a function $f:\mathbb N^n\to\mathbb R$. Power vectors which are not explicitly stored would result in $f$ being zero for these. That would correspond to a polynomial in the following way:
$$ p(X_1,\dots,X_n) = \sum_{v\in\mathbb N^n}\left(f(v)\cdot\prod_{i=1}^nX_i^{v_i}\right)$$
It might be beneficial if your map can iterate over all its entries in term order (e.g. first by degree, then reverse lexicographically), instead of arbitrary order. A red-black tree as the backend of the implementation might be better than a hash map, at least for some operations. So how do your basic operations work in this setup?
- Comparison: Iterate over keys in order, in both polynomials simultaneously. Once you have a key difference, or a coefficient difference for the same key, you know the order.
- Addition: Make a copy of one argument (unless you know you can modify the input), then iterate over the other argument. For each key during that iteration, check whether that key is already present in the sum. If it is, add the coefficients, otherwise add a new key-value pair to the result. You might want to make sure you drop coefficients which become zero.
- Multiplication: Use two nested loops, iterating over all pairings, and add resulting coefficients as I described for addition.
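The addition and multiplication steps above can be sketched directly on the dict-backed representation (a hypothetical sketch, assuming exponent tuples as keys; not code from any of the CAS mentioned):

```python
def poly_add(p, q):
    """Add two polynomials stored as {exponent-tuple: coefficient} dicts."""
    result = dict(p)  # copy one argument
    for exps, coeff in q.items():
        c = result.get(exps, 0) + coeff
        if c == 0:
            result.pop(exps, None)  # drop coefficients that cancel
        else:
            result[exps] = c
    return result

def poly_mul(p, q):
    """Multiply via two nested loops over all term pairings."""
    result = {}
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            exps = tuple(a + b for a, b in zip(e1, e2))  # exponents add
            c = result.get(exps, 0) + c1 * c2
            if c == 0:
                result.pop(exps, None)
            else:
                result[exps] = c
    return result

# (x + y) * (x - y) = x^2 - y^2, with exponent tuples over (x, y):
p = {(1, 0): 1, (0, 1): 1}
q = {(1, 0): 1, (0, 1): -1}
```

With a plain dict the iteration order is not the term order; swapping in a sorted container (like the red-black tree mentioned above) would make the comparison operation a simple simultaneous traversal.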
Unless your polynomials are always dense (in which case a tensor would be appropriate but huge), I see two options:
- keep the list of variables, and for every term keep the coefficient and the list of exponents. For $v$ variables and $t$ terms, you'll store a list of $v$ identifiers, a list of $t$ coefficients and a $t\times v$ array of exponents.
- keep the list of terms, each represented by a coefficient and a list of variable identifiers with an exponent. For an average of $f$ factors per term, you'll store $t$ coefficients, $t\times f$ identifiers and $t\times f$ exponents. (Essentially, you are storing the raw polynomial expression.)
The storage costs are $vI+tC+tvE$ vs. $tfI+tC+tfE$. The access costs depend on the access patterns.
In both cases, it is probably advisable to keep the terms in a normalized order (variables in sorted order and terms in lexicographical order, $a^2b^3=aabbb$ after $ab^5=abbbbb$).
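A small Python sketch of that normalization, assuming the second representation as input (each term a coefficient plus a variable-to-exponent map; all names here are illustrative):

```python
def normalize(terms, var_order):
    """Put a term list into canonical form.

    `terms` is a list of (coefficient, {variable: exponent}) pairs and
    `var_order` fixes the sorted variable order. Returns (coefficient,
    exponent-tuple) pairs sorted lexicographically by exponent tuple,
    so a^2*b^3 -> (2, 3) sorts after a*b^5 -> (1, 5).
    """
    canon = []
    for coeff, exps in terms:
        # variables in sorted order; missing variables get exponent 0
        canon.append((coeff, tuple(exps.get(v, 0) for v in var_order)))
    canon.sort(key=lambda term: term[1])  # terms in lexicographic order
    return canon

# a^2*b^3 + a*b^5, entered in the "wrong" order:
terms = [(1, {'a': 2, 'b': 3}), (1, {'a': 1, 'b': 5})]
```

Having a unique canonical form makes equality testing and merging of like terms straightforward, since two equal polynomials then have identical representations.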