Meaning of A-infinity relations
For your first question: suppose $(A,d,m,m_3,m_4,\dots)$ is an $A_\infty$-algebra. The operation $m_3$ gives a homotopy between $m(-,m(-,-))$ and $m(m(-,-),-)$, which I will abusively denote by $a(bc)$ and $(ab)c$.
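To spell out what "homotopy" means here, the relation satisfied by $m_3$ can be written as follows. The signs depend on the convention chosen, so take this as one possible normalization rather than the definitive one:

$$ d \circ m_3 + m_3 \circ (d \otimes 1 \otimes 1 + 1 \otimes d \otimes 1 + 1 \otimes 1 \otimes d) = m \circ (1 \otimes m) - m \circ (m \otimes 1). $$

The left-hand side is the boundary of $m_3$ in the complex of maps $A^{\otimes 3} \to A$, and the right-hand side is the associator $a(bc) - (ab)c$. In particular, on cycles $a,b,c$ the associator is the boundary $d(m_3(a,b,c))$, so $m$ is associative on homology.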
Now consider the two operations $A^{\otimes 4} \to A$ given by $a(b(cd))$ and $((ab)c)d$. Using $m_3$, you have two ways of going from the first to the second:
- Use $m_3$ three times to create a homotopy: $$a(b(cd)) \to a((bc)d) \to (a(bc))d \to ((ab)c)d.$$
- Use $m_3$ twice to create a homotopy: $$a(b(cd)) \to (ab)(cd) \to ((ab)c)d.$$
Composing the homotopies along each of these two paths gives two maps $A^{\otimes 4} \to A$ of degree $1$, each of them a homotopy from $a(b(cd))$ to $((ab)c)d$. A priori, these two maps have no reason to be homotopic to each other. Well, the higher operation $m_4$ gives a homotopy between these two homotopies!
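Schematically, and leaving the signs unspecified because they depend on the convention, the relation satisfied by $m_4$ reads:

$$ \partial(m_4) = \pm\, m(1 \otimes m_3) \pm m_3(1 \otimes m \otimes 1) \pm m(m_3 \otimes 1) \pm m_3(1 \otimes 1 \otimes m) \pm m_3(m \otimes 1 \otimes 1), $$

where $\partial(m_4)$ denotes the boundary of $m_4$ in the complex of maps $A^{\otimes 4} \to A$. The first three terms correspond to the three steps of the first path, namely $a \cdot m_3(b,c,d)$, $m_3(a,bc,d)$ and $m_3(a,b,c) \cdot d$, and the last two to the two steps of the second path, namely $m_3(a,b,cd)$ and $m_3(ab,c,d)$.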
The even higher operations $m_5$, $m_6$ and so on work the same way. This is very nicely encoded in Stasheff's associahedra. The first two associahedra are just points, representing the identity and $m_2$; the next one is a segment, representing $m_3$, a homotopy between $a(bc)$ and $(ab)c$; the next one is a pentagon, whose edges are the five arrows I drew above; and so on.
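As a small sanity check on this picture, here are the first few associahedra, with the labelling $K_n$ (the one corresponding to $n$ inputs) being my own indexing assumption, since conventions differ:

$$
\begin{array}{c|c|c|c}
n & K_n & \dim K_n = n-2 & \text{number of vertices} \\ \hline
2 & \text{point} & 0 & 1 \\
3 & \text{segment} & 1 & 2 \\
4 & \text{pentagon} & 2 & 5 \\
5 & \text{3-dimensional polytope} & 3 & 14
\end{array}
$$

The vertices of $K_n$ are the complete parenthesizations of a word of $n$ letters, so their number is the Catalan number $\frac{1}{n}\binom{2n-2}{n-1}$, and the top-dimensional cell of $K_n$ corresponds to $m_n$.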
For your second question, the answer is Massey products. Very briefly, suppose that you have a differential graded algebra $A$ and three cycles $a,b,c$ such that $ab = d\alpha$ and $bc = d\beta$. Then the class $abc$ vanishes "in two different ways", because, up to sign, $abc = d(\alpha c) = d(a \beta)$. It follows that $\alpha c - a \beta$ (with the appropriate sign) is a cycle. Its homology class, which is well defined only up to the choices of $\alpha$ and $\beta$, is called the triple Massey product $\langle a,b,c \rangle$. The operation $m_3$ can be used to represent this triple Massey product on homology. It's a bit technical to explain how, and the explanation involves the Homotopy Transfer Theorem.
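For the record, here is the sign check, assuming the Leibniz rule $d(xy) = (dx)y + (-1)^{|x|} x\,(dy)$ (other conventions move the signs around):

$$
\begin{aligned}
d(\alpha c) &= (d\alpha)c + (-1)^{|\alpha|} \alpha\,(dc) = abc, \\
d(a\beta) &= (da)\beta + (-1)^{|a|} a\,(d\beta) = (-1)^{|a|} abc, \\
d\bigl(\alpha c - (-1)^{|a|} a\beta\bigr) &= abc - abc = 0.
\end{aligned}
$$

So with this convention the cycle is $\alpha c - (-1)^{|a|} a\beta$, which is what "$\alpha c - a\beta$" means up to sign.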
For both answers, I think a good reference that cites pretty much all the other possible ones is the book Algebraic Operads by Loday and Vallette.
I'll be informal but try to give a topologist's intuitive interpretation, which is the original source of the idea. Historically, $A_{\infty}$ structures start with Stasheff's work determining what higher coherence homotopies are needed to ensure that an $H$-space $X$ (a space with a product with unit) has a classifying space, given by a bar construction. This was later interpreted as meaning that $X$ has an action of the Stasheff operad in spaces, in fact in CW complexes. The chain complex associated to that operad gives an operad in chain complexes, and thus a notion of $A_{\infty}$-algebra in chain complexes, with its associated bar construction. Going from there to DG categories is not a big leap. Thus the idea is that the specifics of the definition encode the relevant higher homotopies.
One interpretation I prefer is obtained by defining an $A_{\infty}$ structure on a graded space $L$ as a degree $-1$ square-zero coderivation $d$ on the cofree coalgebra on $L[-1]$, also known as its bar construction. Since the coalgebra is cofree, the coderivation is determined by its projection onto the cogenerators, so you get a family of maps $L^{\otimes k} \to L$ of degree $2-k$. All the relations then become consequences of the fact that $d^2 = 0$ on $T_*(L[-1])$, and can be obtained by collecting the terms of the same tensor degree. Conversely, starting from an ($A_{\infty}$-)coalgebra, one can construct the free algebra with a square-zero derivation on it encoding the comultiplication (and the higher operations). Composing these two constructions, we obtain a strictly associative dg algebra which is an "unfolded model" of the $A_{\infty}$-algebra.
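For concreteness, in one common sign convention (operations $m_k \colon L^{\otimes k} \to L$ of degree $2-k$, with the Koszul sign rule; other references use different but equivalent signs), unwinding $d^2 = 0$ gives, for each $n \geq 1$:

$$ \sum_{\substack{r+s+t = n \\ s \geq 1,\ r,t \geq 0}} (-1)^{r+st}\, m_{r+1+t}\bigl(1^{\otimes r} \otimes m_s \otimes 1^{\otimes t}\bigr) = 0. $$

For $n=1$ this says $m_1 \circ m_1 = 0$, so $m_1$ is a differential. For $n=2$ it says $m_1 \circ m_2 = m_2 \circ (m_1 \otimes 1 + 1 \otimes m_1)$, so $m_1$ is a derivation for $m_2$. For $n=3$ it says that $m_2$ is associative up to the homotopy $m_3$.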
So the way I interpret the higher operations is that they are just an ordinary associative multiplication, cramped up because the underlying space is too small.
The main upshot is that this kind of algebraic structure can be transferred along retractions of chain complexes of vector spaces, and since every such chain complex can be retracted onto its homology, it feels like the "right" structure for doing algebra in chain complexes.
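As a rough sketch of what the transfer looks like, suppose $A$ is a dg algebra with product $m$, and choose (these names are my own notation) an inclusion $i \colon H \to A$ of the homology, a projection $p \colon A \to H$, and a degree $1$ map $h \colon A \to A$ with $i p - 1 = d h + h d$. Then, up to signs that depend on the conventions, the first transferred operations on $H$ are:

$$
\begin{aligned}
m_2' &= p \circ m \circ (i \otimes i), \\
m_3' &= \pm\, p \circ m \circ \bigl( (h \circ m \circ (i \otimes i)) \otimes i \bigr) \pm p \circ m \circ \bigl( i \otimes (h \circ m \circ (i \otimes i)) \bigr).
\end{aligned}
$$

The product $m_2'$ is the usual product induced on homology, while $m_3'$ is in general nonzero even though $A$ itself is strictly associative. Roughly speaking, when the product of two classes $a, b$ vanishes in homology, the element $h(m(i(a), i(b)))$ provides (up to sign, and under the usual side conditions on $h$) a choice of element whose boundary is the product of the representatives, which is the kind of data entering a triple Massey product.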
Good references are "Algebraic Operads" by J.-L. Loday and B. Vallette, "Koszul duality for operads" by Ginzburg and Kapranov, and "Modules over Operads and Functors" by Fresse.