Day convolution intuition
Day convolution is a categorification of the monoid algebra construction. There is a formal analogy between the two, but one is not a literal generalisation of the other. So to address your question 3, we should not expect to recover the usual convolution from Day convolution.
Let's develop the following analogy:
\begin{array}{|c|c|} \hline \textbf{monoid algebra} & \textbf{Day convolution} \\ \hline \hline \text{set} & \text{category} \\ \hline \text{monoid} & \text{monoidal category} \\ \hline \text{ring } R & \text{monoidally cocomplete category } \mathcal{V} \\ \hline R\text{-module} & \text{cocomplete } \mathcal{V}\text{-category} \\ \hline R\text{-algebra} & \text{monoidally cocomplete } \mathcal{V}\text{-category} \\ \hline \text{free } R\text{-module on a set } X& \text{free cocomplete } \mathcal{V}\text{-category on a category } \mathcal{C}\\ R^{(X)}& [\mathcal{C}^\text{op},\mathcal{V}]\\ \hline \text{free } R\text{-algebra on a monoid } M & \text{free monoidally cocomplete } \mathcal{V}\text{-category}\\ & \text{on a monoidal category } \mathcal{A} \\ R^{(M)} \text{ with convolution product} & [\mathcal{A}^\text{op},\mathcal{V}] \text{ with Day convolution}\\ \hline \end{array}
Here a monoidally cocomplete category is a cocomplete monoidal category $\mathcal{V}$ such that $\otimes \colon \mathcal{V} \times \mathcal{V} \to \mathcal{V}$ is cocontinuous in each variable. This condition corresponds in our analogy to the distributivity of multiplication over addition in a ring.
Let $(e_x)_{x\in M}$ be the canonical basis for $R^{(M)}$, so that each element of $R^{(M)}$ can be written $f = \sum_{x} f(x) e_x$. The convolution product on $R^{(M)}$ is then determined by the requirement that $M \to R^{(M)}, x \mapsto e_x$ is a monoid homomorphism. For: \begin{equation} \begin{split} f \ast g & = \left(\sum_{x} f(x) e_x \right) \ast \left(\sum_{y} g(y) e_y \right) \\ & = \sum_{x,y} f(x)g(y) e_x \ast e_y \\ & = \sum_{x,y} f(x)g(y) e_{xy}. \end{split} \end{equation}
An analogous argument gives the formula for Day convolution. The representables $\mathcal{A}(-,A)$ provide a "basis" of $[\mathcal{A}^\text{op},\mathcal{V}]$: each object may be expressed as the canonical colimit $$F \cong \int^{A} F(A) \otimes \mathcal{A}(-,A).$$ The Day convolution is determined by the requirement that the Yoneda embedding $\mathcal{A} \to [\mathcal{A}^\text{op},\mathcal{V}]$ be strong monoidal. We have: \begin{equation} \begin{split} F \ast G & \cong \left(\int^{A} F(A) \otimes \mathcal{A}(-,A) \right) \ast \left(\int^{B} G(B) \otimes \mathcal{A}(-,B) \right) \\ & \cong \int^{A,B} F(A)\otimes G(B) \otimes \left(\mathcal{A}(-,A) \ast \mathcal{A}(-,B)\right) \\ & \cong \int^{A,B} F(A)\otimes G(B) \otimes \mathcal{A}(-,A\otimes B). \end{split} \end{equation} Note that the second step uses the requirement that the Day convolution product preserve colimits in each variable, and the third step uses the strong monoidality of the Yoneda embedding.
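Evaluating both computations at a point makes the parallel easier to see. This is only a restatement of the two derivations above; the Kronecker delta $\delta_{z,xy}$ and the test object $C$ are notation introduced here for illustration: $$(f \ast g)(z) = \sum_{x,y} f(x)\,g(y)\,\delta_{z,xy}, \qquad (F \ast G)(C) \cong \int^{A,B} F(A)\otimes G(B) \otimes \mathcal{A}(C,A\otimes B).$$ On the left, $\delta_{z,xy}$ picks out the pairs with $xy = z$. On the right, the hom-object $\mathcal{A}(C,A\otimes B)$ plays the same role, except that it records all the ways of mapping $C$ into $A \otimes B$ rather than a yes/no answer.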
Now, Day convolution can be defined for the more general case of a promonoidal category $\mathcal{A}$. Here we can continue our analogy and think of the promonoidal structure as providing the "structure coefficients" of the Day convolution product.
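To spell the analogy out a little, here is a sketch in which the symbols $c_{x,y}^{z}$ and $P(A,B;C)$ are notation assumed for illustration and do not appear above. On a free module with basis $(e_x)$, a bilinear product is determined by coefficients $c_{x,y}^{z} \in R$ via $e_x \ast e_y = \sum_z c_{x,y}^{z}\, e_z$, and then $$(f \ast g)(z) = \sum_{x,y} f(x)\,g(y)\,c_{x,y}^{z}.$$ Writing $P(A,B;C)$ for the $\mathcal{V}$-valued functor underlying the promonoidal tensor, the corresponding formula is $$(F \ast G)(C) \cong \int^{A,B} F(A)\otimes G(B) \otimes P(A,B;C).$$ A monoidal structure is the special case $P(A,B;C) = \mathcal{A}(C,A\otimes B)$, just as a monoid is the special case $c_{x,y}^{z} = \delta_{z,xy}$.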
A logical view of presheaves is as categorified predicates. If we choose the source category to be discrete, then we can interpret the coend formula as a categorification of existential quantification (if our presheaves only return $\emptyset$ or $\{*\}$ then it is exactly existential quantification). A discrete monoidal category is a monoid. The coend formula then becomes: $$ \begin{aligned} P(x) &= \int^{(c,d)\in \mathcal{D}\times\mathcal{D}}F(c)\times G(d)\times\mathcal{D}(x,c\cdot d) \\ &= \sum (c,d):\mathcal{D}\times\mathcal{D}.\ F(c)\times G(d)\times (x = c\cdot d) \\ &= \exists (c,d) \in \mathcal{D}\times\mathcal{D}.\ F(c)\land G(d)\land (x = c\cdot d) \end{aligned}$$ Note that $\mathcal{D}(x,y)$ is empty except when $x = y$, in which case it is a singleton set containing only $\mathrm{id}$. The second line is what the expression would look like in dependent type theory. The third line is what the expression would look like if we "enriched" in the poset of truth values. (Incidentally, the dependent type theory one actually does generalize to arbitrary $\infty$-groupoids and even arbitrary categories if we replace $=$ with a directed notion.)
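The discrete case is small enough to compute by brute force. Below is a minimal Python sketch, assuming a finite monoid given as a list of elements plus a binary operation; the function names `day_exists` and `day_count` and the example on $\mathbb{Z}/6$ are hypothetical choices for illustration. The first function is the truth-valued (existential) reading, the second tracks only the cardinalities of the sets in the $\sum$ reading.

```python
from itertools import product

def day_exists(F, G, elements, op):
    """Truth-valued convolution on a finite discrete monoid.

    P(x) holds iff there exist c, d with F(c), G(d) and x == op(c, d).
    """
    return {
        x: any(F(c) and G(d) and x == op(c, d)
               for c, d in product(elements, repeat=2))
        for x in elements
    }

def day_count(F_size, G_size, elements, op):
    """Set-valued convolution, recorded by cardinalities only.

    F_size and G_size map an element to the size of the set assigned to it
    (absent keys mean the empty set). In the discrete case the coend is a
    disjoint union, so sizes multiply and then add.
    """
    return {
        x: sum(F_size.get(c, 0) * G_size.get(d, 0)
               for c, d in product(elements, repeat=2)
               if x == op(c, d))
        for x in elements
    }

# Hypothetical example: the monoid Z/6 under addition.
Z6 = list(range(6))

def add6(a, b):
    return (a + b) % 6

def is_multiple_of_3(n):
    return n % 3 == 0      # true on {0, 3}

def is_at_most_1(n):
    return n <= 1          # true on {0, 1}

P = day_exists(is_multiple_of_3, is_at_most_1, Z6, add6)
# P is true exactly on the sums {0, 1, 3, 4}

sizes = day_count({0: 2, 3: 1}, {0: 1, 1: 3}, Z6, add6)
# sizes == {0: 2, 1: 6, 2: 0, 3: 1, 4: 3, 5: 0}
```

Collapsing each count to "zero or nonzero" recovers the truth-valued answer, which is the passage from the second line of the formula to the third.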
This is probably not quite the answer you're looking for, but it might be a nice perspective to keep in your pocket. Connecting back to "standard" convolution, note that $$(f*g)(k) = \sum_{i+j=k}f(i)g(j)$$ can immediately be generalized to an arbitrary monoid for the $+$, an arbitrary commutative monoid for the $\Sigma$, and a completely arbitrary binary function for the multiplication. So there is a lot more generality in the normal convolution than is generally appreciated. The continuous case isn't as easy to generalize, but then that's what the category theory is doing.
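That generality is easy to try out. The following Python sketch takes the three ingredients as parameters; the name `convolve` and its argument names are hypothetical, functions are represented as dicts with finite support, and it is assumed that pairs outside the supports contribute the unit `zero` of the summation monoid.

```python
import math
import operator
from itertools import product

def convolve(f, g, op, add, zero, mul):
    """Generalised convolution of finitely supported functions.

    f, g : dicts from monoid elements to values (absent keys are ignored,
           i.e. assumed to contribute `zero`)
    op   : the monoid operation playing the role of + in i + j = k
    add  : the commutative monoid operation playing the role of the big sum
    zero : the unit of `add`
    mul  : an arbitrary binary function combining f(i) and g(j)
    """
    result = {}
    for (i, fi), (j, gj) in product(f.items(), g.items()):
        k = op(i, j)
        result[k] = add(result.get(k, zero), mul(fi, gj))
    return result

# Ordinary polynomial multiplication: (1 + 2x)(3 + x) = 3 + 7x + 2x^2.
poly = convolve({0: 1, 1: 2}, {0: 3, 1: 1},
                op=operator.add, add=operator.add, zero=0, mul=operator.mul)

# Min-plus ("tropical") convolution: sum becomes min, product becomes +.
tropical = convolve({0: 0, 1: 5}, {0: 1, 1: 2},
                    op=operator.add, add=min, zero=math.inf, mul=operator.add)

# A non-commutative monoid: strings under concatenation.
words = convolve({"a": 1, "b": 2}, {"": 1, "a": 3},
                 op=operator.add, add=operator.add, zero=0, mul=operator.mul)
```

Only `add` needs to be commutative and associative, since the order in which the pairs $(i,j)$ are visited is otherwise arbitrary.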
Actually, if we "enrich" in $\mathbb{R}^+$ then $\mathcal{D}$ becomes a metric space, $F$ and $G$ become $\mathbb{R}^+$-valued distance-decreasing functions, and the coend formula becomes: $$P(x) = \inf_{(c,d)} \{F(c)+G(d)+\|x-c\cdot d\|\}$$ (I'm not completely confident I didn't mess this one up, though I am confident some expression like this is right.)
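One way to sanity-check an expression of this shape is to evaluate it on a small finite example. The Python sketch below does that under several assumptions that are not in the text above: $\|x - c\cdot d\|$ is read as the distance from $x$ to $c\cdot d$, the monoid is $\mathbb{Z}/8$ under addition, and the metric is the circular distance. The names `inf_convolve` and `circ` are hypothetical. It takes $F$ and $G$ to be "distance to $0$" and "distance to $3$" and checks that the result is "distance to $0+3$", as one would hope if distance functions behave like representables.

```python
from itertools import product

def inf_convolve(F, G, elements, op, dist):
    """Evaluate P(x) = min over (c, d) of F(c) + G(d) + dist(x, op(c, d)).

    On a finite set the infimum is attained, so min is used in place of inf.
    """
    return {
        x: min(F[c] + G[d] + dist(x, op(c, d))
               for c, d in product(elements, repeat=2))
        for x in elements
    }

N = 8
ZN = list(range(N))

def add_mod(a, b):
    return (a + b) % N

def circ(a, b):
    """Circular distance on Z/N (translation invariant)."""
    return min((a - b) % N, (b - a) % N)

F = {x: circ(x, 0) for x in ZN}   # distance to 0
G = {x: circ(x, 3) for x in ZN}   # distance to 3

P = inf_convolve(F, G, ZN, add_mod, circ)
# P agrees with the distance to 0 + 3 = 3
```

The agreement here follows from the triangle inequality together with translation invariance of the chosen metric, so it is a check on this particular example rather than a proof of the general formula.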
Here's the example using open sets. The coend formula looks like: $$P(U) = \int^{(U_1,U_2)\in\mathcal{O}\times\mathcal{O}}F(U_1)\times G(U_2) \times (U \subseteq U_1 \otimes U_2)$$
For concreteness let $F(U) = G(U) = X-U$ where $X$ is the whole space. Now, $$P(U) = \bigcup_{U\subseteq U_1\otimes U_2}(X-U_1)\times(X-U_2)$$ When $\otimes = \cap$ you can see directly that $P(U) = (X-U)\times(X-U)$ (consider when $X=\mathbb{R}^+$). (You can also show this through abstract nonsense, which was my comment in reply to Zhen.) When $\otimes = \cup$ the result isn't as clean. For the $\mathbb{R}^+$ example, instead of getting a quarter plane with a square cut out of it, you get the quarter plane with a triangular corner cut out. $\cap$ gives you $P(z) = \{(x,y) \mid z \leq \min(x,y)\}$ while $\cup$ gives you $P(z) = \{(x,y) \mid z \leq x+y\}$.
I don't know if this can be of help, but what I found really useful to gain an intuition behind Day convolution is the correspondence between convolution products and promonoidal structures; the two things can be identified, as every convolution arises from a single promonoidal structure (this dates back to the work of Day himself, and I stated the result in my "coend-cofriend" note, Prop. A.3; I think it's an easy exercise).
The idea behind promonoidal structures is pretty easy to understand: it's what you get if you take the definition of a monoidal category and replace every occurrence of the word "functor" with "profunctor" (or "bimodule", or "distributor", depending on what you want to call them).
Bye, Fosco