Philosophy behind Mochizuki's work on the ABC conjecture
I would have preferred not to comment seriously on Mochizuki's work before much more thought had gone into the very basics, but judging from the internet activity, there appears to be much interest in this subject, especially from young people. It would obviously be very nice if they were to engage with this circle of ideas, regardless of the eventual status of the main result of interest. That is to say, the current sense of urgency to understand something seems generally a good thing. So I thought I'd give the flimsiest bit of introduction imaginable at this stage. On the other hand, as with many of my answers, there's the danger I'm just regurgitating common knowledge in a long-winded fashion, in which case, I apologize.
For anyone who wants to really get going, I recommend as a starting point some familiarity with two papers, 'The Hodge-Arakelov theory of elliptic curves (HAT)' and 'The Galois-theoretic Kodaira-Spencer morphism of an elliptic curve (GTKS).' [It has been noted here and there that the 'Survey of Hodge Arakelov Theory I,II' papers might be reasonable alternatives.][I've just examined them again, and they really might be the better way to begin.] These papers depart rather little from familiar language, are essential prerequisites for the current series on IUTT, and will take you a long way towards a grasp at least of the motivation behind Mochizuki's imposing collected works. This was the impression I had from conversations six years ago, and then Mochizuki himself just pointed me to page 10 of IUTT I, where exactly this is explained. The goal of the present answer is to decipher just a little bit those few paragraphs.
The beginning of the investigation is indeed the function field case (over $\mathbb{C}$, for simplicity), where one is given a family $$f:E \rightarrow B$$ of elliptic curves over a compact base, best assumed to be semi-stable and non-isotrivial. There is an exact sequence $$0\rightarrow \omega_E \rightarrow H^1_{DR}(E) \rightarrow H^1(O_E)\rightarrow0,$$ which is moved by the logarithmic Gauss-Manin connection of the family. (I hope I will be forgiven for using standard and non-optimal notation without explanation in this note.) That is to say, if $S\subset B$ is the finite set of images of the bad fibers, there is a log connection $$H^1_{DR}(E) \rightarrow H^1_{DR}(E) \otimes \Omega_B(S),$$ which does not preserve $\omega_E$. This fact is crucial, since it leads to an $O_B$-linear Kodaira-Spencer map $$KS:\omega_E \rightarrow H^1(O_E)\otimes \Omega_B(S),$$ and thence to a non-trivial map $$\omega_E^2\rightarrow \Omega_B(S).$$ From this, one easily deduces Szpiro's inequality, since a non-zero map of line bundles on the compact curve $B$ forces $\deg(\omega_E^2)\leq \deg \Omega_B(S)=2g_B-2+|S|$: $$\deg (\omega_E) \leq (1/2)( 2g_B-2+|S|).$$ At the most simple-minded level, one could say that Mochizuki's programme has been concerned with replicating this argument over a number field $F$. Since it has to do with differentiation on $B$, which eventually turns into $O_F$, some philosophical connection to $\mathbb{F}_1$-theory begins to appear. I will carry on using the same notation as above, except now $B=Spec(O_F)$.
A large part of HAT is exactly concerned with the set-up necessary to implement this idea, where, roughly speaking, the Galois action has to play the role of the GM connection. Obviously, $G_F$ doesn't act on $H^1_{DR}(E)$. But it does act on $H^1_{et}(\bar{E})$ with various coefficients. The comparison between these two structures is the subject of $p$-adic Hodge theory, which sadly works only over local fields rather than a global one. But Mochizuki noted long ago that something like $p$-adic Hodge theory should be a key ingredient because over $\mathbb{C}$, the comparison isomorphism $$H^1_{DR}(E)\simeq H^1(E(\mathbb{C}), \mathbb{Z})\otimes_{\mathbb{Z}} O_B$$ allows us to completely recover the GM connection by the condition that the topological cohomology generates the flat sections.
In order to get a global arithmetic analogue, Mochizuki has to formulate a discrete non-linear version of the comparison isomorphism. What is non-linear? This is the replacement of $H^1_{DR}$ by the universal extension $$E^{\dagger}\rightarrow E,$$ (the moduli space of line bundles with flat connection on $E$) whose tangent space is $H^1_{DR}$ (considerations of this nature already come up in usual p-adic Hodge theory). What is discrete is the \'etale cohomology, which will just be $E[\ell]$ with global Galois action, where $\ell$ can eventually be large, on the order of the height of $E$ (that is $\deg (\omega_E)$). The comparison isomorphism in this context takes the following form: $$\Xi: A_{DR}=\Gamma(E^{\dagger}, L)^{<\ell}\simeq L|E[\ell]\simeq (L|e_{E})\otimes O_{E[\ell]}.$$ (I apologize for using the notation $A_{DR}$ for the space that Mochizuki denotes by a calligraphic $H$. I can't seem to write calligraphic characters here.) Here, $L$ is a suitably chosen line bundle of degree $\ell$ on $E$, which can then be pulled back to $E^{\dagger}$. The inequality refers to the polynomial degree in the fiber direction of $E^{\dagger} \rightarrow E$. The isomorphism is effected via evaluation of sections at $$E^{\dagger}[\ell]\simeq E[\ell].$$ Finally, $$ L|E[\ell]\simeq (L|e_{E})\otimes O_{E[\ell]}$$ comes from Mumford's theory of theta functions. The interpretation of the statement is that it gives an isomorphism between the space of functions of some bounded fiber degree on non-linear De Rham cohomology and the space of functions on discrete \'etale cohomology. This kind of statement is entirely due to Mochizuki. One sometimes speaks of $p$-adic Hodge theory with finite coefficients, but that refers to a theory that is not only local, but deals with linear De Rham cohomology with finite coefficients.
Now for some corrections: As stated, the isomorphism is not true, and must be modified at the places of bad reduction, the places dividing $\ell$, and the infinite places. This correction takes up a substantial portion of the HAT paper. That is, the isomorphism is generically true over $B$, but to make it true everywhere, the integral structures must be modified in subtle and highly interesting ways, while one must consider also a comparison of metrics, since these will obviously figure in an arithmetic analogue of Szpiro's conjecture. The correction at the finite bad places can be interpreted via coordinates near infinity on the moduli stack of elliptic curves as the subtle phenomenon that Mochizuki refers to as 'Gaussian poles' (in the coordinate $q$). Since this is a superficial introduction, suffice it to say for now that these Gaussian poles end up being a major obstruction in this portion of Mochizuki's theory.
In spite of this, it is worthwhile giving at least a small flavor of Mochizuki's Galois-theoretic KS map. The point is that $A_{DR}$ has a Hodge filtration defined by
$F^rA_{DR}= \Gamma(E^{\dagger}, L)^{ < r} $
(the direction is unconventional), and this is moved around by the Galois action induced by the comparison isomorphism. So one gets thereby a map $$G_F\rightarrow Fil (A_{DR})$$ into some space of filtrations on $A_{DR}$. This is, in essence, the Galois-theoretic KS map. That is, if we consider the equivalence over $\mathbb{C}$ of $\pi_1$-actions and connections, the usual KS map measures the extent to which the GM connection moves around the Hodge filtration. Here, we are measuring the same kind of motion for the $G_F$-action.
This is already very nice, but now comes a very important variant, essential for understanding the motivation behind the IUTT papers. In the paper GTKS, Mochizuki modified this map, producing instead a 'Lagrangian' version. That is, he assumed the existence of a Lagrangian Galois-stable subspace $G^{\mu}\subset E[\ell]$ giving rise to another isomorphism $$\Xi^{Lag}:A_{DR}^{H}\simeq L\otimes O_{G^{\mu}},$$ where $H$ is a Lagrangian complement to $G^{\mu}$, which I believe does not itself need to be Galois stable. $H$ is acting on the space of sections, again via Mumford's theory. This can be used to get another KS morphism to filtrations on $A_{DR}^{H}$. But the key point is that
$\Xi^{Lag}$, in contrast to $\Xi$, is free of the Gaussian poles
via an argument I can't quite remember (if I ever knew).
At this point, it might be reasonable to see if $\Xi^{Lag}$ contributes towards a version of Szpiro's inequality (after much work and interpretation), except for one small problem. A subspace like $G^{\mu}$ has no reason to exist in general. This is why GTKS is mostly about the universal elliptic curve over a formal completion near $\infty$ on the moduli stack of elliptic curves, where such a space does exist. What Mochizuki explains on IUTT page 10 is exactly that the scheme-theoretic motivation for IUG was to enable the move to a single elliptic curve over $B=Spec(O_F)$, via the intermediate case of an elliptic curve 'in general position'.
To repeat:
A good 'nonsingular' theory of the KS map over number fields requires a global Galois invariant Lagrangian subspace $G^{\mu}\subset E[\ell]$.
One naive thought might just be to change base to the field generated by the $\ell$-torsion, except one would then lose the Galois action one was hoping to use. (Remember that Szpiro's inequality is supposed to come from moving the Hodge filtration inside De Rham cohomology.) On the other hand, such a subspace does often exist locally, for example, at a place of bad reduction. So one might ask if there is a way to globally extend such local subspaces.
It seems to me that this is one of the key things going on in the IUTT papers I-IV. As he says in loc. cit., he works with various categories of collections of local objects that simulate global objects. It is crucial in this process that many of the usual scheme-theoretic objects, local or global, are encoded as suitable categories with a rich and precise combinatorial structure. The details here get very complicated, the encoding of a scheme into an associated Galois category of finite \'etale covers being merely the trivial case. For example, when one would like to encode the Archimedean data coming from an arithmetic scheme (which again, will clearly be necessary for Szpiro's conjecture), the attempt to come up with a category of about the same order of complexity as a Galois category gives rise to the notion of a Frobenioid. Since these play quite a central role in Mochizuki's theory, I will quote briefly from his first Frobenioid paper:
'Frobenioids provide a single framework [cf. the notion of a "Galois category"; the role of monoids in log geometry] that allows one to capture the essential aspects of both the Galois and the divisor theory of number fields, on the one hand, and function fields, on the other, in such a way that one may continue to work with, for instance, global degrees of arithmetic line bundles on a number field, but which also exhibits the new phenomenon [not present in the classical theory of number fields] of a "Frobenius endomorphism" of the Frobenioid associated to a number field.'
I believe the Frobenioid associated to a number field is something close to the finite \'etale covers of $Spec(O_F)$ (equipped with some log structure) together with metrized line bundles on them, although it's probably more complicated. The Frobenius endomorphism for a prime $p$ is then something like the functor that just raises line bundles to the $p$-th power. This is a functor that would come from a map of schemes if we were working in characteristic $p$, but obviously not in characteristic zero. But this is part of the reason to start encoding in categories:
We get more morphisms and equivalences.
Some of you will notice at this point the analogy to developments in algebraic geometry where varieties are encoded in categories, such as the derived category of coherent sheaves. There as well, one has reconstruction theorems of the Orlov type, as well as the phenomenon of non-geometric morphisms of the categories (say actions of braid groups). Non-geometric morphisms appear to be very important in Mochizuki's theory, such as the Frobenius above, which allows us to simulate characteristic $p$ geometry in characteristic zero. Another important illustrative example is a non-geometric isomorphism between Galois groups of local fields (which can't exist for global fields because of the Neukirch-Uchida theorem). In fact, I think Mochizuki was rather fond of Ihara's comment that the positive proof of the anabelian conjecture was somewhat of a disappointment, since it destroys the possibility that encoding curves into their fundamental groups will give rise to a richer category. Anyways, I believe the importance of non-geometric maps of categories encoding rather conventional objects is that
they allow us to glue together several standard categories in nonstandard ways.
Obviously, to play this game well, some things need to be encoded in rigid ways, while others should have more flexible encodings.
For a very simple example that gives just a bare glimpse of the general theory, you might consider a category of pairs $$(G,F),$$ where $G$ is a profinite topological group of a certain type and $F$ is a filtration on $G$. It's possible to write down explicit conditions that ensure that $G$ is the Galois group of a local field and $F$ is its ramification filtration in the upper numbering (actually, now I think about it, I'm not sure about 'explicit conditions' for the filtration part, but anyways). Furthermore, it is a theorem of Mochizuki and Abrashkin that the functor that takes a local field to the corresponding pair is fully faithful. So now, you can consider triples $$(G,F_1, F_2),$$ where $G$ is a group and the $F_i$ are two filtrations of the right type. If $F_1=F_2$, then this 'is' just a local field. But now you can have objects with $F_1\neq F_2$, that correspond to strange amalgams of two local fields.
As another example, one might take a usual global object, such as $$ (E, O_F, E[\ell], V)$$ (where $V$ denotes a collection of valuations of $F(E[\ell])$ that restrict bijectively to the valuations $V_0$ of $F$), and associate to it a collection of local categories indexed by $V_0$ (something like Frobenioids corresponding to the $E_v$ for $v\in V_0$). One can then try to glue them together in non-standard ways along sub-categories, after performing a number of non-standard transformations. My rough impression at the moment is that the 'Hodge theatres' arise in this fashion. [This is undoubtedly a gross oversimplification, which I will correct in later amendments.] You might further imagine that some construction of this sort will eventually retain the data necessary to get the height of $E$, but also have data corresponding to the $G^{\mu}$, necessary for the Lagrangian KS map. In any case, I hope you can appreciate that a good deal of 'dismantling' and 'reconstructing,' what Mochizuki calls surgery, will be necessary.
I can't emphasize enough times that much of what I write is based on faulty memory and guesswork. At best, it is superficial, while at worst, it is (not even) wrong. [In particular, I am no longer sure that the GTKS map is used in an entirely direct fashion.] I have not yet done anything with the current papers other than give them a cursory glance. If I figure out more in the coming weeks, I will make corrections. But in the meanwhile, I do hope what I wrote here is mostly more helpful than misleading.
Allow me to make one remark about set theory, about which I know next to nothing. Even with more straightforward papers in arithmetic geometry, the question sometimes arises about Grothendieck's universe axiom, mostly because universes appear to be used in SGA4. Usually, number-theorists (like me) neither understand, nor care about such foundational matters, and questions about them are normally met with a shrug. The conventional wisdom of course is that any of the usual theorems and proofs involving Grothendieck cohomology theories or topoi do not actually rely on the existence of universes, except general laziness allows us to insert some reference that eventually follows a trail back to SGA4. However, this doesn't seem to be the case with Mochizuki's paper. That is, universes and interactions between them seem to be important actors rather than conveniences. How this is really brought about, and whether more than the universe axiom is necessary for the arguments, I really don't understand enough yet to say. In any case, for a number-theorist or an algebraic geometer, I would guess it's still prudent to acquire a reasonable feel for the 'usual' background and motivation (that is, HAT, GTKS, and anabelian things) before worrying too much about deeper issues of set theory.
I'll take a stab at answering this controversial question in a way that might satisfy the OP and benefit the mathematical community. I also want to give some opinions that contrast with or at least complement grp. Like others, I must give the caveats: I do not understand Mochizuki's claimed proof or his other work, and I make no claims about the veracity of his recent work.
First, some background which might satisfy the OP. For years, Mochizuki has been working on things related to Grothendieck's anabelian program. Here is why one might hope this is useful in attacking problems like ABC:
Begin with the Neukirch-Uchida theorem. See "Über die absoluten Galoisgruppen algebraischer Zahlkörper," by J. Neukirch, Journées Arithmétiques de Caen (Univ. Caen, Caen, 1976), pp. 67–79. Asterisque, No. 41-42, Soc. Math. France, Paris, 1977. Also "Isomorphisms of Galois groups," by K. Uchida, J. Math. Soc. Japan 28 (1976), no. 4, 617–620.
The main result of these papers is that a number field is determined by its absolute Galois group in the following sense: fix an algebraic closure $\bar Q / Q$, and two number fields $K$ and $L$ in $\bar Q$. Then if $\sigma: Gal(\bar Q / K) \rightarrow Gal(\bar Q / L)$ is a topological isomorphism of groups, then $\sigma$ extends to an inner automorphism $Int(\tau): g \mapsto \tau g \tau^{-1}$ of $Gal(\bar Q / Q)$. Thus $\tau$ conjugates the number field $K$ to the number field $L$, and they are isomorphic.
So while class field theory guarantees that the absolute Galois group $Gal(\bar Q / K)$ determines (the profinite completion of) the multiplicative group $K^\times$, the Neukirch-Uchida theorem guarantees that the entire field structure is determined by the profinite group structure of the Galois group. Figuring out how to recover aspects of the field structure of $K$ from the profinite group structure of $Gal(\bar Q / K)$ is a difficult corner of number theory.
Next, consider a (smooth) curve $X$ over $Q$; suppose that the fundamental group $\pi_1(X({\mathbb C}))$ is nonabelian. Let $\pi_1^{geo}(X)$ be the profinite completion of this nonabelian group. Basic properties of the etale fundamental group give a short exact sequence: $$1 \rightarrow \pi_1^{geo}(X) \rightarrow \pi_1^{et}(X) \rightarrow Gal(\bar Q / Q) \rightarrow 1.$$
Now, just as one can ask about recovering a number field from its absolute Galois group ($Gal(\bar Q / K)$ is isomorphic to $\pi_1^{et}(K)$), one can ask how much one can recover about the curve $X$ from its etale fundamental group. Any $Q$-point $x$ of $X$, i.e. map of schemes from $Spec(Q)$ to $X$, gives a section $s_x: Gal(\bar Q / Q) \rightarrow \pi_1^{et}(X)$.
One case of the famous "section conjecture" of Grothendieck states that this gives a bijection from $X(Q)$ to the set of homomorphisms $Gal(\bar Q / Q) \rightarrow \pi_1^{et}(X)$ splitting the above exact sequence. One hopes, more generally, to recover the structure of $X$ as a curve over $Q$ from the induced outer action of $Gal(\bar Q / Q)$ on $\pi_1^{geo}(X)$. (take an element $\gamma \in Gal(\bar Q / Q)$, lift it to $\tilde \gamma \in \pi_1^{et}(X)$, and look at conjugation of the normal subgroup $\pi_1^{geo}(X)$ by $\tilde \gamma$, well-defined up to inner automorphism independently of the lift.)
As in the case of the Neukirch-Uchida theorem, there is an active and difficult corner of number theory devoted to recovering properties of rational points of (hyperbolic) curves from etale fundamental groups. Here are two dramatically difficult problems in the same spirit:
How can you describe the regulator of a number field $K$ from the structure of the profinite group $Gal(\bar Q / K)$?
Given a section $s: Gal(\bar Q / Q) \rightarrow \pi_1^{et}(X)$, how can one describe the height of the corresponding point in $X(Q)$?
I would place Mochizuki's work in this anabelian corner of number theory; I have always kept a safe and respectful distance from this corner.
Now, to say something not quite as ancient that I gleaned from flipping through Mochizuki's recent work:
Many people here on MO and elsewhere have been following research on the field with one element. It is a tempting object to seek, because analogies between number fields and function fields break down quickly when you realize there is no "base scheme" beneath $Spec(Z)$. But I see Mochizuki's work as an anabelian approach to this problem, and I'll try to describe my understanding of this below.
Consider a smooth curve $X$ over a function field $F_p(T)$. The anabelian approach suggests looking at the short exact sequence $$1 \rightarrow \pi_1^{et}(X_{\overline{F_p(T)}}) \rightarrow \pi_1^{et}(X) \rightarrow Gal(\overline{F_p(T)} / F_p(T)) \rightarrow 1.$$ But much more profitable is to look instead at $X$ as a surface over $F_p$ which corresponds in the anabelian perspective to studying $$1 \rightarrow \pi_1^{et}(X_{\bar F_p}) \rightarrow \pi_1^{et}(X) \rightarrow Gal(\bar F_p / F_p) \rightarrow 1.$$ But this is pretty close to looking at $\pi_1^{et}(X)$ by itself; there's just a little profinite $\hat Z$ quotient floating around, but this can be characterized (I think) group theoretically within the study of $\pi_1^{et}(X)$ itself.
I would understand (after reading Mochizuki) that looking at curves $X$ over function fields $F_p(T)$ as surfaces over $F_p$ is like looking at only the etale fundamental group $\pi_1^{et}(X)$ without worrying about the map to $Gal(\overline{F_p(T)} / F_p(T))$.
So, the natural number field analogue would be the following. Consider a smooth curve $X$ over $Q$. In fact, let's make $X = E - \{ 0 \}$ be a once-punctured elliptic curve over $Q$. Then the absolute anabelian geometry suggests that to study $X$, it should be profitable to study the etale fundamental group $\pi_1^{et}(X)$ all by itself as a profinite group. This is the anabelian analogue of what others might call "studying (a $Z$-model of) $X$ as a surface over the field with one element".
Without understanding any of the proofs in Mochizuki, I think that his work arises from this absolute anabelian perspective of understanding the arithmetic of once-punctured elliptic curves over $Q$ from their etale fundamental groups. The ABC conjecture is equivalent to (a suitably modified form of) Szpiro's conjecture, which is a conjecture about the arithmetic of elliptic curves over $Q$.
Now here is a suggestion for number theorists who, like myself, have unfortunately ignored this anabelian corner. Let's try to read the papers of Neukirch and/or Uchida to get a start, and let's try to understand Minhyong Kim's work on Siegel's Theorem ("The motivic fundamental group of $P^1 \backslash ( 0, 1, \infty )$ and the theorem of Siegel," Invent. Math. 161 (2005), no. 3, 629–656.)
It would be wonderful if, while we're waiting for the experts to weigh in on Mochizuki's work, we took some time to revisit some great results in the anabelian program. If anyone wants to start a reading group / discussion blog on these papers, I would enjoy attending and discussing.
Last revision: 10/20. (Probably the last for at least some time to come: until Mochizuki uploads his revisions of IUTT-III and IUTT-IV. My apology for the multiple revisions. )
Completely rewritten. (9/26)
It seems indeed that nothing like Theorem 1.10 from Mochizuki's IUTT-IV could hold.
Here is an infinite set of counterexamples, assuming for convenience two standard conjectures (the first being in fact a consequence of ABC), that contradict Thm. 1.10 very badly.
Assumptions:
A (Consequence of ABC) For all but finitely many elliptic curves over $\mathbb{Q}$, the conductor $N$ and the minimal discriminant $\Delta$ satisfy $\log{|\Delta|} < (\log{N})^2$.
B (Uniform Serre Open Image conjecture) For each $d \in \mathbb{N}$, there is a constant $c(d) < \infty$ such that for every number field $F/\mathbb{Q}$ with $[F:\mathbb{Q}] \leq d$, and every non-CM elliptic curve $E$ over $F$, and every prime $\ell \geq c(d)$, the Galois representation of $G_F$ on $E[\ell]$ has full image $\mathrm{GL}_2(\mathbb{Z}/{\ell})$. (In fact, it is sufficient to take the weaker version in which $F$ is held fixed. )
Further, as far as I can tell from the proof of Theorem 1.10 of IUTT-IV, the only reason for taking $F := F_{\mathrm{tpd}}\big( \sqrt{-1}, E_{F_{\mathrm{tpd}}}[3\cdot 5] \big)$ --- rather than simply $F := F_{\mathrm{tpd}}(\sqrt{-1})$ --- was to ensure that $E$ has semistable reduction over $F$. Since I will only work in what follows with semistable elliptic curves over $\mathbb{Q}$, I will assume, for a mild technical convenience in the examples below, that for elliptic curves already semistable over $F_{\mathrm{tpd}}$, we may actually take $F := F_{\mathrm{tpd}}(\sqrt{-1})$ in Theorem 1.10.
The infinite set of counterexamples. They come from Masser's paper [Masser: Note on a conjecture of Szpiro, Asterisque 1990], as follows. Masser has produced an infinite set of Frey-Hellegouarch (i.e., semistable and with rational 2-torsion) elliptic curves over $\mathbb{Q}$ whose conductor $N$ and minimal discriminant $\Delta$ satisfy $$ (1) \hspace{3cm} \frac{1}{6}\log{|\Delta|} \geq \log{N} + \frac{\sqrt{\log{N}}}{\log{\log{N}}}. $$ (Thus, $N$ in these examples may be taken arbitrarily large. ) By (A) above, taking $N$ big enough will ensure that $$ (2) \hspace{3cm} \log{|\Delta|} < (\log{N})^2. $$ Next, the sum of the logarithms of the primes in the interval $\big( (\log{N})^2, 3(\log{N})^2 \big)$ is $2(\log{N})^2 + o((\log{N})^2)$, so it is certainly $> (\log{N})^2$ for $N \gg 0$ big enough. Thus, by (2), it is easy to see that the interval $\big( (\log{N})^2, 3(\log{N})^2 \big)$ contains a prime $\ell$ which divides neither $|\Delta|$ nor any of the exponents $\alpha = \mathrm{ord}_p(\Delta)$ in the prime factorization $|\Delta| = \prod p^{\alpha}$ of $|\Delta|$.
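The prime-counting step above can be checked numerically on a small scale. The following is only a sketch: the function names `primes_up_to`, `log_prime_sum` and `pick_level` are made up for illustration, the value of $x$ stands in for $(\log N)^2$, and the factorizations fed to `pick_level` are toy data, not Masser's curves.

```python
import math

def primes_up_to(n):
    """Sieve of Eratosthenes: all primes p <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(n) + 1):
        if sieve[p]:
            # Cross out the multiples of p starting from p*p.
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [p for p in range(2, n + 1) if sieve[p]]

def log_prime_sum(lo, hi):
    """Sum of log p over the primes p in the open interval (lo, hi)."""
    return sum(math.log(p) for p in primes_up_to(math.ceil(hi)) if lo < p < hi)

def pick_level(factorization, lo, hi):
    """Smallest prime l in (lo, hi) dividing neither |Delta| nor any exponent.

    `factorization` maps each prime p dividing Delta to alpha = ord_p(Delta)
    (hypothetical input format chosen for this sketch).
    """
    for l in primes_up_to(math.ceil(hi)):
        if not (lo < l < hi):
            continue
        if l in factorization:  # l divides Delta
            continue
        if any(alpha % l == 0 for alpha in factorization.values()):
            continue  # l divides one of the exponents
        return l
    return None

x = 10**4  # plays the role of (log N)^2
print(log_prime_sum(x, 3 * x) / x)            # should be close to 2
print(pick_level({2: 8, 3: 5, 7: 1}, 10, 30))
```

The ratio printed first is the quantity that the prime number theorem says tends to $2$; the second call just illustrates the selection rule for $\ell$ on a toy discriminant.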
Consider now the pair $(E,\ell)$: it has $F_{\mathrm{mod}} = \mathbb{Q}$, and since $E$ has rational $2$-torsion, $F_{\mathrm{tpd}} = \mathbb{Q}$ as well. Let $F := \mathbb{Q} \big( \sqrt{-1}\big)$. I claim that, upon taking $N$ big enough, the pair $(E_F,\ell)$ arises from an initial $\Theta$-datum as in IUTT-I, Definition 3.1. Indeed:
- Certainly (a), (e), (f) of IUTT-I, Def. 3.1 are satisfied (with appropriate $\underline{\mathbb{V}}, \, \underline{\epsilon}$);
- (b) of IUTT-I, Def. 3.1 is satisfied since by construction $E$ is semistable over $\mathbb{Q}$;
- (c) of IUTT-I, Def. 3.1 is satisfied, in view of (B) above and the choice of $\ell$, as soon as $N \gg 0$ is big enough (recall that $\ell > (\log{N})^2$ by construction!), and by the observation that, for $v$ a place of $F = \mathbb{Q}(\sqrt{-1})$, the order of the $v$-adic $q$-parameter of $E$ equals $\mathrm{ord}_v (\Delta)$, which equals $\mathrm{ord}_p(\Delta)$ for $v \mid p > 2$, and $2\cdot\mathrm{ord}_2(\Delta)$ for $v \mid 2$;
while $\mathbb{V}_{\mathrm{mod}}^{\mathrm{bad}}$ consists of the primes dividing $\Delta$;
- Finally, (d) of IUTT-I, Def. 3.1 is satisfied upon excluding at most four of Masser's examples $E$. (See page 37 of IUTT-IV).
Now, take $\epsilon := \big( \log{N} \big)^{-2}$ in Theorem 1.10 of IUTT-IV; this is certainly permissible for $N \gg 0$ large enough. I claim that the conclusion of Theorem 1.10 contradicts (1) as soon as $N \gg 0$ is large enough.
For note that Mochizuki's quantity $\log(\mathfrak{q})$ is precisely $\log{|\Delta|}$ (reference: see e.g. Szpiro's article in the Grothendieck Festschrift, vol. 3); his $\log{(\mathfrak{d}^{\mathrm{tpd}})}$ is zero; his $d_{\mathrm{mod}}$ is $1$; and his $\log{(\mathfrak{f}^{\mathrm{tpd}})}$ is our $\log{N}$. By construction, our choice $\epsilon := \big( \log{N} \big)^{-2}$ then makes $1/\ell < \epsilon$ and $\ell < 3/\epsilon$, whence the final display of Theorem 1.10 would yield $$ \frac{1}{6} \log{|\Delta|} \leq (1+29\epsilon) \cdot \log{N} + 2\log{(3\epsilon^{-8})} < \log{N} + 32\log{\log{N}} + 32, $$ where we have used $\epsilon \log{N} = (\log{N})^{-1} < 1$ for $N > 3$, $\log{(\epsilon^{-8})} = 16\log{\log{N}}$, and $2\log{3} < 3$.
The last display contradicts (1) as soon as $N \gg 0$ is big enough.
Thus Masser's examples yield infinitely many counterexamples to Theorem 1.10 of IUTT-IV (as presently written).
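To get a feeling for how large "$N \gg 0$" has to be here, one can compare the two excess terms over $\log N$ numerically. This is a rough sketch only: it takes the lower bound (1) and the middle expression of the last display at face value, writes $L$ for $\log N$, and the function names are invented for the illustration.

```python
import math

def masser_excess(L):
    """Excess of (1/6) log|Delta| over log N guaranteed by (1), with L = log N."""
    return math.sqrt(L) / math.log(L)

def theorem_excess(L):
    """Excess over log N allowed by (1 + 29*eps) * L + 2 * log(3 * eps**-8),
    with eps = L**-2 as in the choice epsilon = (log N)^(-2)."""
    eps = L ** -2
    # log(3 * eps**-8) is expanded as log 3 + 8 * log(1/eps) to stay in float range.
    return 29 * eps * L + 2 * (math.log(3) + 8 * math.log(1 / eps))

def contradiction_holds(L):
    """True when the lower bound (1) exceeds the upper bound from the display."""
    return masser_excess(L) > theorem_excess(L)

for L in (1e6, 1e8, 1e9):
    print(L, masser_excess(L), theorem_excess(L), contradiction_holds(L))
```

Under these formulas the lower bound only overtakes the upper bound for $\log N$ somewhere between $10^8$ and $10^9$, so the contradiction is genuinely asymptotic, as the phrase "as soon as $N \gg 0$ is big enough" indicates.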
Added on 10/15, and revised 10/20. Mochizuki has commented on the apparent contradiction between Masser's examples and Theorem 1.10:
http://www.kurims.kyoto-u.ac.jp/~motizuki/Inter-universal%20Teichmuller%20Theory%20IV%20(comments).pdf
He writes that he will revise portions of IUTT-III and IUTT-IV, and will make them available in the near future. (He estimates January 2013 to be a reasonable period). He confirms the following ["essentially"] anticipated revision of Theorem 1.10:
Let $E/\mathbb{Q}$ be a semistable elliptic curve with [say, for the sake of simplifying] rational $2$-torsion [i.e., a Frey-Hellegouarch curve] of minimal discriminant $\Delta$ and conductor $N$ (square-free). For $\epsilon > 0$, let $N_{\epsilon} := \prod_{p \mid N, p < \epsilon^{-1}} p$. Then: $$ \frac{1}{6} \log{|\Delta|} < \big( 1 + \epsilon \big) \log{N} + \Big( \omega(N_{\epsilon}) \cdot \log{(1/\epsilon)} - \log{N_{\epsilon}} \Big) + O\big( \log{(1/\epsilon)} \big) $$ $$ < \log{N} + \Big( \epsilon \log{N} + \big( \epsilon \log{(1/\epsilon)} \big)^{-1} \Big) + o\Big( \big( \epsilon \log{(1/\epsilon)} \big)^{-1} \Big), $$ where $\omega(\cdot)$ denotes "number of prime factors." The second estimate comes from the prime number theorem in the form $\pi(t) = t/\log{t} + t/(\log{t})^2 + o\big( t/(\log{t})^2 \big)$, applied to $t := \epsilon^{-1}$, and is sharp if you restrict $\epsilon$ to the range $\epsilon^{-1} < (\log{N})^{\xi}$ with $\xi < 1$, as there nothing prevents $N$ from being divisible by all primes $p < (\log{N})^{\xi}$. In particular, as the Erdos-Stewart-Tijdeman-Masser construction is based on the pigeonhole principle, which cannot preclude that $N$ be divisible by all the primes $< (\log{N})^{2/3}$, the second estimate could very well be sharp in all the Masser examples. As it is easily seen that the bracketed term exceeds the range $\sqrt{\log{N}}/(\log{\log{N}})$ of Masser's examples, this has the implication that
the Erdos-Stewart-Tijdeman-Masser method cannot disprove Mochizuki's revised inequality,
which therefore seems reasonable.
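The extremal case mentioned above, where $N$ is divisible by every prime $p < \epsilon^{-1}$, can be explored numerically. In that case the bracketed term $\omega(N_{\epsilon}) \log(1/\epsilon) - \log N_{\epsilon}$ becomes $\pi(t)\log t - \sum_{p<t}\log p$ with $t = \epsilon^{-1}$. The sketch below (function names are hypothetical, and $t = 10^5$ is an arbitrary test value) compares it with the main term $t/\log t = (\epsilon\log(1/\epsilon))^{-1}$.

```python
import math

def primes_below(t):
    """All primes p < t, by a sieve of Eratosthenes."""
    n = t - 1
    if n < 2:
        return []
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(n) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [p for p in range(2, n + 1) if sieve[p]]

def bracketed_term(t):
    """omega(N_eps) * log(1/eps) - log(N_eps) for t = 1/eps,
    assuming N is divisible by every prime p < t."""
    ps = primes_below(t)
    return len(ps) * math.log(t) - sum(math.log(p) for p in ps)

def main_term(t):
    """The quantity (eps * log(1/eps))^(-1) = t / log t."""
    return t / math.log(t)

t = 10**5
print(bracketed_term(t) / main_term(t))
```

At this size the ratio is still visibly above $1$, which is consistent with the secondary terms in the prime number theorem decaying only like $1/\log t$.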
On the other hand, if we take $\epsilon := (\log{N})^{-1}$ and assume $\omega(N_{\epsilon})$ bounded, this would yield $(1/6)\log{|\Delta|} < \log{N} + O(\log{\log{N}})$, just as before. (Thus, Mochizuki predicts that this last bound must hold for $N$ a large enough square-free integer such that the number of primes $< \log{N}$ dividing $N$ is bounded. I can see evidence neither for nor against this at the moment: again, the Masser and Erdos-Stewart-Tijdeman constructions are based on the pigeonhole principle, and do not seem to be able to exclude the small primes $< \log{N}$. So here we have an open problem by which one could probe Mochizuki's revised inequality. A reminder: in terms of the $abc$-triple, $\Delta$ is essentially $(abc)^2$, and $N = \mathrm{rad}(abc)$).
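For readers who want to experiment with the dictionary in the reminder, here is a small sketch that turns an $abc$-triple into the two sides of the Szpiro-type comparison. It takes $\Delta = (abc)^2$ and $N = \mathrm{rad}(abc)$ literally, ignoring the bounded power-of-$2$ normalization hidden in "essentially"; the function names are invented, and the triple $2 + 3^{10}\cdot 109 = 23^5$ is used only as a familiar test case.

```python
import math

def radical(n):
    """Product of the distinct primes dividing n (trial division)."""
    rad, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            rad *= p
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:  # whatever is left is a single prime factor
        rad *= n
    return rad

def szpiro_quantities(a, b):
    """Return ((1/6) log|Delta|, log N) under the heuristic
    Delta = (abc)^2, N = rad(abc), for the coprime triple a + b = c."""
    if math.gcd(a, b) != 1:
        raise ValueError("a and b must be coprime")
    c = a + b
    log_delta_sixth = math.log(a * b * c) / 3  # (1/6) * log((abc)^2)
    log_conductor = math.log(radical(a * b * c))
    return log_delta_sixth, log_conductor

a, b = 2, 3**10 * 109
lhs, rhs = szpiro_quantities(a, b)
print(a + b == 23**5, radical(a * b * (a + b)), lhs, rhs)
```

The difference `lhs - rhs` is the quantity that the various error terms discussed above are supposed to bound.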
A side remark: note that the inverse $1/\ell$ of the prime level from the de Rham-Etale correspondence $(E^{\dagger}, < \ell) \leftrightarrow E[\ell]$ in Mochizuki's "Hodge-Arakelov theory" ultimately figures as the $\epsilon$ in the ABC conjecture.
[I have deleted the remainder of the 10/15 Addendum, since it is now obsolete after Mochizuki's revised comments. ]