Why are my submissions not accepted by top conferences, while similar papers from others are accepted?
One thing my papers lack is unnecessary but sophisticated mathematical jargon used to suggest that something important is happening. For example, I noticed that in others' papers, even when the contribution was simply adding an extra objective term to an optimization framework, they used complicated algebraic representations or visualization techniques to relate their contribution to some bigger underlying phenomenon, when the short answer is that they added the term because it obviously should improve the result.
I can think of several ways to interpret this observation, some more charitable than others.
Computer science research papers are not about the results per se, but about the techniques developed to attain those results. It's possible that in the reviewers' eyes, the apparent connection to the larger underlying phenomenon is the main contribution of the paper: not merely that the outcome is better, but at least an attempted explanation of why the outcome is better.
People are bad judges of the quality of their own research results. More to the point: your opinion of your research is irrelevant; only your peers' opinions actually matter. If they find your papers less interesting than others', then by definition your papers are less interesting.
Emerson was wrong. The world will not beat a path to your door if you merely build a better mousetrap. You have to sell your results. For your papers to be accepted, they must follow the cultural expectations of your audience. If the IJCAI/ICML/CVPR audience expects a certain level of mathematical sophistication, then papers that do not display that sophistication are less likely to be accepted, even if that sophistication is unnecessary.
It is very easy to confuse mathematical depth/complexity with importance/difficulty. Top theoretical computer science conferences have a reputation for preferring more mathematically "difficult" papers to papers using more elementary techniques, and lots of reviewers assume without justification that any result that appears straightforward in retrospect must have been easy to derive. But unless someone proves that P=NP, "trivial" is different from "nondeterministically trivial".
Because CVPR, ICML, and IJCAI are enormous conferences with low acceptance rates, acceptance decisions have high variance. Given the breadth and complexity of the field, the ridiculous number of submissions, and the limited time for reviews, it is impossible for the program committee to make fully informed judgements about every submission. There is an element of randomness even at smaller conferences, but for larger prestigious conferences, the randomness overwhelms the actual "signal". In 2014, NIPS ran an experiment with two independent program committees; most papers accepted by one committee were rejected by the other. It amazes me that anyone found this result surprising.
PC members are apes; they do apey things. As in any other large community, there are sub-communities within machine learning that prefer their own papers to others. Even though submissions are blinded, reviewers can identify tribal affiliation—if only subconsciously—by writing style, citation patterns, choice of method, choice of data set, or choice of evaluation metrics. If you're not in the right tribe, your papers are less likely to be accepted.
"They used complicated algebraic representations or visualization techniques to relate their contribution to some bigger underlying phenomenon, when the short answer is that they added the term because it obviously should improve the result."
This (relating your contribution to the bigger picture) is an important part of papers in our field, so I guess in yours, too. The reviewers and readers don't know the bigger picture of your specific research; you need to tell them. It sounds like you don't, whereas others - while doing research of similar quality - do.
One thing I've learned is not to 'overfit' to accepted papers that have just been published. It is very easy to look at a published paper and say "this isn't really very novel! I can do this too!", and then proceed to use it as a benchmark for minimum viable novelty. I've also noticed that papers aimed at the borderline tend to get rejected more often. There is also the time element: technical novelty is judged relative to when you submit, not to when the paper you are comparing against was published.
Then again, I don't think it's healthy to assume one's work is up to the novelty bar. You may think your work is actually novel, but it might not be. Reviews and acceptance/rejection decisions tell the truth, to some extent. There is some variance in the outcomes, and sometimes luck is not on your side. But I believe a reasonably good paper should get into a top conference within one or two more tries after its first rejection.
What kinds of scores do you normally get? Rejections come in many degrees. There is a big difference between one strong accept plus two rejects, and three weak rejects.