Did merging Black Holes in GW150914 give up entropy and information to the gravitational waves, since they lost 3 solar masses?
It might be popular to oversimplify general relativity and say things like these two black holes have masses $M_1$ and $M_2$ and velocities $\vec v_1$ and $\vec v_2$ and potential energy $U.$ And it can sound appealing to imagine that general relativity is just like Newtonian mechanics but with some corrections and limitations for high speeds and strong forces. But it isn't right.
Firstly, in general relativity when you say an object has mass $M$, you do not mean that if you summed up the mass $m_i$ of each part, they add up to $M.$ In fact $$M\neq m_1+m_2+\dots +m_N.$$
Secondly, energy and mass aren't conserved, and there isn't always an unambiguous way to even talk about a total energy. The word general in general relativity comes from the general case, when you don't have the kinds of global coordinate systems and global frames that allow you to turn energy densities into energies. It forces the theory to become a local theory.
So let's get at what it means to lose mass by looking at an example of how it could happen without black holes and without waves. Let's say you have a spherical bowling ball shaped planet and a spherical shell of matter surrounding that planet. There are lots of ways spacetime can be curved, and sometimes you can label them by parameters with units of mass. So suppose spacetime starts out curved like a Schwarzschild type of parameter $M$ in the region between the shell and the ball (so outside the planet but inside the shell) and is curved like a Schwarzschild type of parameter $M+80m$ in the region outside both.
This is totally possible and in fact if $m>0$ this can be done with regular ordinary matter and the more of it we put there the larger we can make $m.$
Now let's say that shell has some springs on the bottom. If we let the shell fall, then when the shell starts to make contact with the planet the springs hit first. The whole time the shell falls you continue to have a solution of Schwarzschild type with a parameter of $M+80m$ in the region outside both.
And this is actually amazing because Einstein's Equation says that it is energy, not mass, that is the source of gravity, and the shell is falling, falling faster and faster. But what we call a solution of Schwarzschild type with parameter $M+80m$ refers to solutions that could be generated by particles with less energy that are farther out, or by particles with more energy that are in closer.
So as the shell falls it changes from a low energy shell that is farther out into a higher energy shell that is closer in. But those things generate the same type, specifically the same parameter.
As the springs make contact the shell slows down. But the energy of the springs goes up by an equal amount and you continue to have a solution of Schwarzschild type with a parameter of $M+80m$ in the region outside the shell. Now you can take all the springs and use them to power a fancy particle accelerator and use it to create a bunch of matter and antimatter using the energy. Save some of the energy too, don't use it all now.
Matter and antimatter both have regular positive energy. And both have regular positive mass. And you continue to have a solution of Schwarzschild type with a parameter of $M+80m$ in the region outside the shell. You just have a larger number of particles because you have these additional particles you didn't have before and you have all their antiparticles which you also didn't have before. But you continue to have a solution of Schwarzschild type with a parameter of $M+80m$ in the region outside the shell.
Now you use the new particles to make a shell of matter and give it some energy, enough to have escape velocity. And you use the new antiparticles to make a shell of antimatter and give it some energy, enough to have escape velocity. And each shell has positive energy and positive mass and so now you have three shells, the matter shell, the antimatter shell, the old shell, and there is the original planet too.
You continue to have a solution of Schwarzschild type with a parameter of $M+80m$ in the region outside all the shells. But you have a solution of Schwarzschild type with a parameter of $M+79m$ in the region between the matter shell and the antimatter shell. And you have a solution of Schwarzschild type with a parameter of $M+78m$ in the region between the antimatter shell and the old shell. And those shells are taking off like a rocket.
To someone far away everything looked like a regular parameter $M+80m$ solution, with the images getting a bit smaller as the old shell collapsed and then getting larger as you sent out the matter shell. But the matter shell gets thinner as it goes up, and eventually, like a balloon that gets so big it reaches you, the shell reaches that far away person. And once it does, they don't notice very much since it is so very very thin (they were far away), but they are now inside, so they see a solution of parameter $M+79m.$ Then they see a second shell (though it is super thin too), and when that passes they are inside it as well, so they see a solution of parameter $M+78m.$
So they see every single particle of the old shell is still there and every single particle of the ball is still there. But the parameter value decreased from $M+80m$ to $M+78m$ and recall that $M$ and $m$ were in kilograms, so they might say something like the mass of that planetary system decreased. Even though every particle is still there and the mass of none of them changed. They are just closer together now and some of the energy they would naturally get when they got closer together is gone now. And that makes a solution with a smaller parameter, it makes $M+78m$ instead of $M+80m.$
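The bookkeeping in this example can be sketched in a few lines of Python. This is only an illustration under the spherical symmetry assumed above, where the parameter seen at a given radius is fixed by what lies inside that radius. The `Shell` class, the function name, and all the radii are hypothetical, and parameters are counted as offsets from $M$ in units of $m.$

```python
from dataclasses import dataclass


@dataclass
class Shell:
    name: str
    radius: float  # hypothetical radius, arbitrary units
    jump: int      # jump in the parameter across the shell, in units of m


def offset_seen_at(r, shells):
    """Return k such that an observer at radius r sees parameter M + k*m."""
    # Only shells inside the observer's radius contribute.
    return sum(s.jump for s in shells if s.radius < r)


observer_radius = 100.0

# Snapshots of the shell radii as the two new shells fly outward.
before = [Shell("old", 1.0, 78), Shell("antimatter", 5.0, 1), Shell("matter", 10.0, 1)]
matter_passed = [Shell("old", 1.0, 78), Shell("antimatter", 50.0, 1), Shell("matter", 200.0, 1)]
both_passed = [Shell("old", 1.0, 78), Shell("antimatter", 150.0, 1), Shell("matter", 400.0, 1)]

snapshots = [
    ("before", before),
    ("matter shell passed", matter_passed),
    ("both shells passed", both_passed),
]
for label, shells in snapshots:
    print(label, "-> M +", offset_seen_at(observer_radius, shells), "m")
```

No shell is ever removed from the lists. The value the far away observer reads off drops only because two of the shells have moved outside their radius.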
So we learned a bit, and we learned that the parameter value assigned to that planetary system wasn't just based on how many particles there were and what their individual masses were, but it depended on the arrangement and motion of the parts relative to each other as well.
The same issue happens with the black holes. There was a system of both holes and the system's parameters depended not just on the parameters of each hole, but also on how far apart they were from each other and how they were moving relative to each other.
As they move, the waves are like the springs: they make the holes move slower, in a sense, than they would without waves, and that energy is, in a sense, now part of the wave. And if the wave is still there, it could all look the same to us. Just like that planet system looked the same until that spring energy was turned into something (the matter and antimatter shells) that could be sent to us.
Those black holes are actually stars, they are collections of hydrogen and helium and electrons and neutrons and protons and such. They just act super similar to what a black hole with a certain parameter would act like. And we are influenced by those objects all from the times and places before an event horizon formed.
We don't know that event horizons form. Maybe instead of crossing an event horizon, particles disappear when they reach it. We don't actually know for sure. And this is key. It's because we are always affected by things before they cross and never during or after. So you might want to say they crossed. But the things affecting you are the things from before they crossed. Those are the things that matter.
So what happens is the things that make up the two stars don't speed up as much as they would without waves, and the waves travel outwards. So if far away things were very very close to Kerr type of parameter $(M+m,J+j)$ then in between the wave and the stars you might get something closer to Kerr type of parameter $(M,J).$
So we, far away, keep seeing Kerr type of parameter $(M+m,J+j)$ up until that wave passes us, and then we see a Kerr type of parameter $(M,J).$ Very much like the planetary example I gave.
Now that I've addressed most of the common misconceptions (though this is still pretty handwavy) we could look at your questions.
Since the final Black Hole (BH) had 3 solar masses less of mass than the original binary BH,
it seems the 2 BHs lost mass, and with it event surface area, entropy, and information.
Let's denote the surface area of a black hole of mass $M$ by $A(M).$ Then it is basic black hole thermodynamics that $$A(M_1+M_2)>A(M_1)+A(M_2).$$
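For non-rotating holes this inequality can be checked directly. Assuming the Schwarzschild area formula $A(M) = 16\pi (G/c^2)^2 M^2$ (an assumption here, since real holes also spin), the cross term in $(M_1+M_2)^2$ is what makes up the difference:
$$A(M_1+M_2) - A(M_1) - A(M_2) = 16\pi (G/c^2)^2 \left[(M_1+M_2)^2 - M_1^2 - M_2^2\right] = 32\pi (G/c^2)^2 M_1 M_2 > 0.$$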
When the black holes of mass $M_1$ and $M_2$ merge into a black hole of mass $M$ and some waves you'll find that $A(M)>A(M_1)+A(M_2).$ So the surface area increases even if you had eternal black holes instead of stars that are well approximated by black holes.
If this seems unintuitive, imagine a point in between the two black holes. You could imagine time as the z direction. Then the event horizons are like tubes. So there is a tube and another tube and they spiral around like a pair of paper towel rolls placed on opposite ends of a record being played. But as they spread out and spiral in there is a point where the center no longer has a chance to escape. At that event a point grows into its own cylinder and expands to be a bigger radius cylinder and a bigger radius cylinder until it reaches the other event horizons.
It was the other two tilting and entwining around each other like braids that made the barrier. And the existence of the barrier is what defined the event inside that can't escape.
If you had some bars in a bird cage, you can escape if you are small enough. But if you twist the bars around tighter and tighter then eventually they touch. The place they touch is the barrier, and like an ice cream cone with its tip pointing towards the past, the cone traced backwards defines where the new horizon forms.
So a new horizon starts to form way in the center of the black holes and it jumps outwards at the speed of light and the waves escape between the bars before they touch. Since there was lots of volume between the black holes there was lots of room to create a bigger horizon out of the joint system.
And besides, it's normal for black holes to create space, so it's not hard to create more area as they whirl.
The observation of GW150914 also appears to be consistent with black hole thermodynamics, according to which the event horizon area of the final black hole must be larger than the sum of the event horizon areas of the binary components.
To see this, we consider first the initial total area of the event horizons, assuming the binary components to be Schwarzschild black holes,
$$A_i = 16 \pi (G/c^2)^2 [M^2_1 + M^2_2] \tag 1$$
where $M_1 = 29 \ M_\odot $ and $M_2 = 36 \ M_\odot \ .$
The black hole formed after the merger has a mass $M_f = 62 \ M_\odot $ and spin angular momentum $L_f = 0.67 \ \frac {G M^2_f} {c} $.
Hence, the final area of the event horizon (appropriate for a Kerr black hole) is,
$$A_f = 8 \pi (G/c^2)^2 M_f \bigg (M_f + \sqrt{M^2_f - (L_f \ c/G\ M_f)^2} \bigg ) \tag 2$$
Therefore, the ratio,
$$\frac {A_f}{A_i} = \frac { M^2_f \bigg (1+ \sqrt{1 - (L_f\ c/G \ M^2_f)^2} \bigg )}{2 \bigg (M^2_1 + M^2_2 \bigg )} = 1.57 \tag 3$$
where one has used,
$$L_f\ c/G\ M^2_f = 0.67 \tag 4$$
as given in Abbott et al., PRL 116, 061102 (2016).
Thus, equation (3) demonstrates that Black hole Thermodynamics is intact after the event.
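The arithmetic behind equation (3) can be reproduced with a short Python sketch. It assumes, as above, Schwarzschild components and a Kerr remnant, and uses the quoted values $M_1 = 29$, $M_2 = 36$, $M_f = 62$ (solar masses) and dimensionless spin $0.67$. The function name `area_ratio` is just an illustrative choice.

```python
import math


def area_ratio(m1, m2, m_final, chi_final):
    """A_f / A_i for two Schwarzschild holes merging into one Kerr hole.

    Masses may be in any common unit, since the (G/c^2)^2 factors cancel.
    chi_final is the dimensionless spin L_f c / (G M_f^2).
    """
    kerr_part = m_final**2 * (1.0 + math.sqrt(1.0 - chi_final**2))
    schwarzschild_part = 2.0 * (m1**2 + m2**2)
    return kerr_part / schwarzschild_part


ratio = area_ratio(29.0, 36.0, 62.0, 0.67)
print(f"A_f / A_i = {ratio:.2f}")  # prints A_f / A_i = 1.57
```

Because only the ratio is computed, no value of $G$ or $c$ is needed. A result above 1 means the final horizon area exceeds the combined initial areas.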