Why does a SOAP message have to be sent over HTTP?
SOAP can be sent over different transports. HTTP is just one of them.
For example: SMTP, TCP/IP
Overview
SOAP is a messaging protocol and in a nutshell is just another XML language.
Its purpose is the data exchange over networks. Its concern is the encapsulation of these data and the rules for transmitting and receiving them.
HTTP is an application protocol and SOAP messages are placed as the HTTP payload.
Although there is the overhead of HTTP, it has the advantage that it is a protocol that is open to firewalls, well-understood and widely-supported. Thus, web services can be accessed and exposed via technology already in-place.
SOAP messages are usually exchanged via HTTP. Although it is possible to use other (application) protocols, e.g. SMTP or FTP, the non-HTTP bindings are not specified by SOAP specs and are not supported by WS-BP (interoperability spec).
You could exchange SOAP messages over raw TCP but then you would have web services that are not interoperable (not compliant to WS-BP).
Nowadays the debate is why have the SOAP overhead at all and not send data over HTTP (RESTful WS).
Why use HTTP for SOAP?
I will try to address in more detail the question in the OP, asking why use HTTP for SOAP:
First of all SOAP defines a data encapsulation format and that's that.
Now the majority of traffic in the web is via HTTP. HTTP is literary EVERYWHERE and supported by a well-established infrastructure of servers and clients(namely browsers). Additionally it is a very well understood protocol.
The people who created SOAP wanted to use this ready infrastructure and
- SOAP messages were designed so that they can be tunneled over HTTP
- In the specs they do not refer to any other non-HTTP binding but specifically refer to HTTP as an example for transfer.
The tunneling over HTTP would and did help in it's rapid adoption. Because the infrastructure of HTTP is already in-place, companies would not have to spend extra money for another kind of implementation. Instead they can expose and access web services using technology already deployed.
Specifically in Java a web service can be deployed either as a servlet endpoint or as an EJB endpoint. So all the underlying network sockets, threads, streams, HTTP transactions etc. are handled by the container and the developer focuses only on the XML payload.
So a company has Tomcat or JBoss running in port 80 and the web service is deployed and accessible as well.
There is no effort to do programming at the transport layer and the robust container handles everything else.
Finally the fact that firewalls are configured not to restrict HTTP traffic is a third reason to prefer HTTP.
Since HTTP traffic is usually allowed, the communication of clients/servers is much easier and web services can function without network security blockers issues as a result of the HTTP tunneling.
SOAP is XML=plain text so firewalls could inspect the content of HTTP body and block accordingly. But in this case they could also be enhanced to reject or accept SOAP depending on the contents.This part which seems to trouble you is not related to web services or SOAP, and perhaps you should start a new thread concerning how firewalls work.
Having said that, the fact that HTTP traffic is unrestricted often causes security issues since firewalls are essentially by-passed, and that is why application-gateways come in.
But this is not related to this post.
Summary
So to sum up the reasons for using HTTP:
- HTTP is popular and successful.
- HTTP infrastructure is in place so no extra cost to deploy web services.
- HTTP traffic is open to firewalls, so there are no problems during web service functioning as a result of network security.