Why is the ideal sheaf $\mathcal{I}_Y$ equal to $\mathcal{O}(-Y)$ and not $\mathcal{O}(Y)$?
Your definition of $\mathcal O(D)$ seems to be lacking precision. If $D=-p$ for a single point $p$, then $\mathcal O(D)$ is the sheaf of functions whose divisor is at least $p$, which means these functions must have a zero at $p$. If $D=p$, then $\mathcal O(D)$ is the sheaf of functions whose divisor is at least $-p$, which means these functions can have a pole at $p$. So the point is that signs matter: $\mathcal O(-Y)$ is correct.
Put $X = \mathbb P^1$.
$\mathcal O(-p)$ is defined to be the sheaf of meromorphic functions $f$ on open subsets of $X$ such that the divisor of $f$ plus $-p$ is at least $0$, that is, $\operatorname{div}(f)-p\geq 0$. This means sections of $\mathcal O(-p)$ must have at least a simple zero at $p$, and no poles anywhere. It is clear that sections of $\mathcal O(-p)$ (which are holomorphic functions with the above condition) are also sections of $\mathcal O$ (whose sections are holomorphic functions, without any qualification).
$\mathcal O(p)$ is defined to be the sheaf of meromorphic functions $f$ on open subsets of $X$ such that the divisor of $f$ plus $p$ is at least $0$, that is, $\operatorname{div}(f)+p\geq 0$. This means sections of $\mathcal O(p)$ are allowed to have at most a simple pole at $p$, as I said. For example $1/(z-p)$ is a global section of $\mathcal O(p)$, but this is certainly not a section of $\mathcal O$.
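To make the sign check concrete, here is the divisor computation for that example, written as a sketch under two assumptions that are mine rather than the answer's: $z$ is an affine coordinate on $\mathbb P^1$ in which $p$ is a finite point, and $\infty$ denotes the one point not covered by $z$.

$$ \operatorname{div}\!\left(\frac{1}{z-p}\right) = [\infty]-[p] \quad\Longrightarrow\quad \operatorname{div}\!\left(\frac{1}{z-p}\right)+[p] = [\infty] \;\geq\; 0 \,, $$

$$ \operatorname{div}(z-p) = [p]-[\infty] \quad\Longrightarrow\quad \operatorname{div}(z-p)-[p] = -[\infty] \;\not\geq\; 0 \,. $$

So $1/(z-p)$ is a global section of $\mathcal O(p)$, while $z-p$, although it vanishes at $p$, is a section of $\mathcal O(-p)$ only over open sets that avoid $\infty$.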
While hwong557 gave a very helpful answer, it wasn't quite what I was looking for, as I really needed to have an explicit example worked through to see all the (probably obvious to everyone else) connections between different objects. I now think I've understood what's going on in an explicit example, and I'm happy. So I'm going to post what I wrote up for myself here, in case anyone else is like me and needs this spelt out step-by-step in this way.
Here is an explicit example on $\mathbb{P}^1$.
In Huybrechts' Complex Geometry on page 79 we are given the following prescription for constructing a line bundle $\mathcal{O}(D)$ given a Cartier divisor $D$:
If $D=\sum a_i[Y_i]\in\text{Div}(X)$ corresponds to $f \in H^0(X,\mathcal{K}_X^*/\mathcal{O}_X^*)$, which in turn is given by functions $f_i \in \mathcal{K}_X(U_i)$ for an open covering $X= \bigcup U_i$, then we define $\mathcal{O}(D) \in \text{Pic}(X)$ as the line bundle with transition function $\psi_{ij} \mathrel{\mathop:}= f_i \cdot f_j^{-1} \in H^0(U_i \cap U_j,\mathcal{O}_X^*)$.
(Note that $\psi_{ij}\mathrel{\mathop:}=\psi_i\cdot\psi_j^{-1}$ where $\psi_i$ are the local trivialisations $\pi^{-1}(U_i) \cong U_i \times \mathbb{C}^r$. That is, $\psi_{ij}$ is the transition function for moving from $U_j$ to $U_i$.)
Let's work through an explicit example to understand this. Work on the space $X\equiv\mathbb{P}^1$, with homogeneous coordinates $x_0$ and $x_1$, and write 'NR' for $x_1\neq0$ ('north region') and 'SR' for $x_0\neq0$ ('south region').
Let the divisor be $D=-p$, where $p$ is the 'north pole' at $x_0=0$. This divisor is defined by the meromorphic functions $\tfrac{x_1}{x_0}$ on $U_{NR}$ (which has a simple pole at $p$) and $1$ on $U_{SR}$. By the above prescription for defining the line bundle $\mathcal{O}(-p)$ corresponding to this divisor $-p$, we set the transition function, $$\psi_{SR \to NR}=\frac{x_1}{x_0} \,.$$ We see that this line bundle has no global holomorphic sections, as expected (any nonzero holomorphic function on $U_{SR}$ would transition into a function that is singular at $p$ on $U_{NR}$). We also see that this is the line bundle which in a different notation is written $\mathcal{O}(-1)$, again as expected.
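The claim about global sections can be checked by a short computation. This is only a sketch, and the local coordinates $z=\tfrac{x_1}{x_0}$ on $U_{SR}$ and $w=\tfrac{x_0}{x_1}=\tfrac1z$ on $U_{NR}$ are notation introduced here for illustration. A global section is a pair $(s_{SR},s_{NR})$ of holomorphic functions related by the transition function on the overlap:

$$ s_{SR}=\sum_{k\geq0} b_k z^k \,, \qquad s_{NR}=\psi_{SR\to NR}\,s_{SR} = z\sum_{k\geq0} b_k z^{k} = \sum_{k\geq0} b_k\, w^{-(k+1)} \,. $$

Every term on the right has a negative power of $w$, so $s_{NR}$ is holomorphic at $p$ (that is, at $w=0$) only if all $b_k=0$. Hence the only global section is zero.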
Now consider a very different object: the ideal sheaf $\mathscr{I}_p$ of holomorphic functions vanishing at $p$. This is the sheaf that can assign to the open sets in our open covering holomorphic functions of the form, $$ U_{NR}: a_1\left(\frac{x_0}{x_1}\right)+a_2\left(\frac{x_0}{x_1}\right)^2+\ldots \, , $$ $$ U_{SR}: b_0+b_1\left(\frac{x_1}{x_0}\right)+\ldots \, . $$ (More specifically, we can assign a holomorphic function with no constant term in $\tfrac{x_0}{x_1}$, as in the first form, to any open set containing $p$, and an arbitrary holomorphic function, as in the second form, to any open set not containing $p$.) We can see that the sheaf we have defined is a locally free sheaf. This is because on $U_{SR}$ the space of allowed functions is exactly $\mathcal{O}_{SR}$ (the isomorphism $\phi_{SR} : \mathscr{I}_p(U_{SR}) \to \mathcal{O}_{SR}$ is the identity), while on $U_{NR}$ there is an isomorphism $\phi_{NR} : \mathscr{I}_p(U_{NR}) \to \mathcal{O}_{NR}$ given by multiplying by $\tfrac{x_1}{x_0}$. Hence we have an open covering $\{U_{SR},U_{NR}\}$ for which $\mathscr{I}_p(U)$ is isomorphic to $\mathcal{O}_U$ for each of the open sets $U$.
Using the general prescription (outlined below) for associating a line bundle to a locally free sheaf, we assign the transition function, $$ \psi_{SR \to NR} = \phi_{NR} \cdot \phi_{SR}^{-1} = \frac{x_1}{x_0} \, . $$ We see that this is exactly the transition function for $\mathcal{O}(-p)$ that we found above.
This is a counter-intuitive result for the following reason. The line bundle $\mathcal{O}(p)$ has a global section vanishing exactly at $p$, but it is $\mathcal{O}(-p)$ that is the line bundle associated to the sheaf $\mathscr{I}_p$ of holomorphic functions vanishing at $p$. This sounds like a mismatch: given that $\mathcal{O}(p)$ is the bundle with a section vanishing at $p$, it sounds like it should be this line bundle that corresponds to $\mathscr{I}_p$.
We can see that this 'mismatch' is a result of the definition of $\mathcal{O}(D)$. In the case of $\mathcal{O}(-p)$, the contribution of $NR$ to the transition function $\psi_{SR \to NR}$ is given by the meromorphic function that defines the divisor in that region, i.e. $\tfrac{x_1}{x_0}$, which has a pole at $x_0=0$. In the case of $\mathscr{I}_p$, on the other hand, the contribution to the transition function (of the line bundle that we construct using the standard prescription for getting from a locally free sheaf to a line bundle) is given by the isomorphism $\phi_{NR}=\tfrac{x_1}{x_0}$ with $\mathcal{O}_{NR}$. By definition this isomorphism undoes the restriction of holomorphic functions in this region to ones that vanish at $p$. That is, the isomorphism must act by multiplying by a factor that on its own would give a pole at $p$.
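The same bookkeeping can be written for $D=p$ as a side-by-side comparison. This is a sketch in which the symbols $f_{NR}$, $f_{SR}$ for the local defining functions of $p$ and the superscripts on $\psi$ are labels introduced here. Take $f_{NR}=\tfrac{x_0}{x_1}$ and $f_{SR}=1$. Then

$$ \psi^{\mathcal{O}(p)}_{SR\to NR} = \frac{f_{NR}}{f_{SR}} = \frac{x_0}{x_1} \,, \qquad \psi^{\mathscr{I}_p}_{SR\to NR} = \phi_{NR}\,\phi_{SR}^{-1} = \frac{1}{f_{NR}}\cdot f_{SR} = \frac{x_1}{x_0} = \left(\psi^{\mathcal{O}(p)}_{SR\to NR}\right)^{-1} . $$

The pair $(s_{SR},s_{NR})=(f_{SR},f_{NR})=\left(1,\tfrac{x_0}{x_1}\right)$ satisfies $s_{NR}=\psi^{\mathcal{O}(p)}_{SR\to NR}\,s_{SR}$, so it is a global section of $\mathcal{O}(p)$ vanishing exactly at $p$. The ideal sheaf, by contrast, is trivialised by dividing by the $f$'s, which inverts the transition function and gives $\mathcal{O}(-p)$.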
Generalities on associating a line bundle to a locally free sheaf:
From page 33 of Aspinwall (and elsewhere) we have the following definition of a locally free sheaf.
We call a sheaf $\mathscr{E}$ locally free of rank $n$ if there is an open covering $\{U_{\alpha}\}$ of $X$ such that $\mathscr{E}(U_{\alpha})\cong\mathcal{O}_X(U_{\alpha})^{\oplus n}$ for all $\alpha$.
Suppose we have a locally free sheaf $\mathscr{E}$, with an open covering satisfying the above condition. Let $\phi_{\alpha} : \mathscr{E}(U_{\alpha}) \to \mathcal{O}_X(U_{\alpha})^{\oplus n}$ be the explicit isomorphism. To get the holomorphic vector bundle corresponding to this locally free sheaf, we are told by Aspinwall to define on each intersection $U_{\alpha} \cap U_{\beta}$ the $n \times n$ matrix of holomorphic functions, $$ \phi_{\beta}\phi_{\alpha}^{-1}: \mathcal{O}_X(U_{\alpha} \cap U_{\beta})^{\oplus n} \to \mathcal{O}_X(U_{\alpha} \cap U_{\beta})^{\oplus n} \, . $$
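As a consistency check on this prescription, here is a sketch using the shorthand $g_{\beta\alpha}$ for the matrix above (a name chosen here, not Aspinwall's). The definition makes the usual compatibility conditions for transition functions automatic:

$$ g_{\beta\alpha} \mathrel{\mathop:}= \phi_{\beta}\phi_{\alpha}^{-1} \,, \qquad g_{\alpha\alpha} = 1 \,, \qquad g_{\alpha\beta} = g_{\beta\alpha}^{-1} \,, \qquad g_{\gamma\beta}\,g_{\beta\alpha} = \phi_{\gamma}\phi_{\beta}^{-1}\phi_{\beta}\phi_{\alpha}^{-1} = g_{\gamma\alpha} \ \text{ on } U_{\alpha}\cap U_{\beta}\cap U_{\gamma} \,. $$

The last identity is the cocycle condition, which is what allows the local pieces $U_{\alpha}\times\mathbb{C}^n$ to be glued into a single vector bundle.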
How does this really work? What are these 'explicit isomorphisms'? Here is an example, at least for the case of a rank-one locally free sheaf (corresponding to a line bundle). Take the space $X \equiv \mathbb{P}^1$, with homogeneous coordinates $x_0$ and $x_1$, and consider the following sheaf. Assign to any open set $U \subseteq X$ the set of expressions in $x_0$ and $x_1$ of homogeneous degree $1$ that have no poles on $U$.
So on the total space, we can assign a polynomial $ax_0+bx_1$, with $a,b \in \mathbb{C}$. ($x_0^2/x_1$ and so on would have poles.) Now consider the open covering consisting of the two open sets $x_0\neq0$ and $x_1\neq0$.
On $x_0\neq0$, we can assign an expression $ax_0+bx_1+c\,\tfrac{x_1^2}{x_0}+\ldots$. Similarly on $x_1\neq0$, we can assign an expression $a'x_1+b'x_0+c'\,\tfrac{x_0^2}{x_1}+\ldots$. On each of these two open sets, the space of allowed expressions is isomorphic to the space of holomorphic functions: in $x_0\neq0$ the isomorphism (call it $\phi_0$) is given by dividing by $x_0$, while in $x_1\neq0$ the isomorphism (call it $\phi_1$) is given by dividing by $x_1$. Hence we have found an open covering such that the sheaf assigns to each open set an object isomorphic to the space of holomorphic functions on that open set. That is, we have shown that this sheaf is locally free (of rank one).
Further, now that we have the explicit isomorphisms for a locally free sheaf, we can construct the corresponding line bundle by the above general prescription. The transition function from $x_0\neq0$ to $x_1\neq0$ is given by acting with the composition $\phi_1\circ\phi_0^{-1} = \tfrac{x_0}{x_1}$. This is exactly the line bundle $\mathcal{O}(1)$. It has global holomorphic sections corresponding to assigning the holomorphic function $c_0\,\tfrac{x_0}{x_1}+c_1$ on $x_1\neq0$ and the holomorphic function $c_0+c_1\,\tfrac{x_1}{x_0}$ on $x_0\neq0$. From the viewpoint of a locally free sheaf, the global sections are given by homogeneous polynomials $c_0\,x_0+c_1\,x_1$. (Note that the sheaf we have discussed, or its associated line bundle rather, is precisely $\mathcal{O}_{\mathbb{P}^1}(1)$.)
Let us say something general about what we have just said, and about what would happen if we were to take a different degree for the polynomials. The space of polynomials of homogeneous degree one on the total space is not isomorphic to the space of global sections of $\mathcal{O}$, since there is no single function we can divide by that turns every such polynomial into a holomorphic function on all of $X$. So what this sheaf assigns to the total space is not isomorphic to what $\mathcal{O}$ assigns to it. But on the 'north' and 'south' open sets, we do have isomorphisms between what we assign to these open sets and $\mathcal{O}$ (and since these two open sets provide an open covering, this tells us that we have defined a locally free sheaf). The isomorphisms on the different open sets patch together to tell us about the transition functions. The $n$ in the sheaf $\mathcal{O}(n)$ is encoded in the homogeneous degree of the expressions we allow (on all open sets, including the total space). Making a different choice of this degree means that we have to choose different maps on the north and south regions to give explicit isomorphisms with $\mathcal{O}$ (we have to divide by different powers of $x_0$ and $x_1$). So the set of isomorphisms that we patch together is different for each $n$, we get different transition functions, and these define the different line bundles.
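For a general degree $n$, the recipe can be sketched as follows, where the coordinates $z=\tfrac{x_1}{x_0}$ on $x_0\neq0$ and $w=\tfrac{x_0}{x_1}$ on $x_1\neq0$ are labels introduced here. Dividing by $x_0^n$ and $x_1^n$ respectively gives

$$ \phi_0 = x_0^{-n} \,, \qquad \phi_1 = x_1^{-n} \,, \qquad \phi_1\circ\phi_0^{-1} = \left(\frac{x_0}{x_1}\right)^{n} , $$

and for $n\geq0$ a homogeneous polynomial of degree $n$ becomes, in the two trivialisations,

$$ \sum_{k=0}^{n} c_k\, x_0^{\,n-k}x_1^{\,k} \;\longmapsto\; \sum_{k=0}^{n} c_k\, z^{k} \ \text{ on } x_0\neq0 \,, \qquad \sum_{k=0}^{n} c_k\, w^{\,n-k} \ \text{ on } x_1\neq0 \,. $$

The two local expressions differ by exactly the factor $w^{n}=\left(\tfrac{x_0}{x_1}\right)^{n}$, so they glue to a global section, and $\mathcal{O}(n)$ has an $(n+1)$-dimensional space of global sections for $n\geq0$. For $n=1$ this reproduces the sections of $\mathcal{O}(1)$ above, and for $n=-1$ the transition function $\tfrac{x_1}{x_0}$ is the one found for $\mathcal{O}(-p)$ and $\mathscr{I}_p$.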