[Economics] Why are cost functions often assumed to be convex in microeconomics?

Solution 1:

There are several reasons:

  1. Didactic Reasons: Other users seem to have missed it, but in your question you specify that you are talking about "(introductory) microeconomics" [emphasis mine].

    Well, the most prosaic answer is simply that it is much easier to solve cost minimization and various other models when costs are assumed to be convex.

    This in itself is sufficient reason to construct problems with convex cost functions in introductory microeconomics courses. Demand and supply are generally not linear, yet in most textbooks and introductory problems they are assumed to be linear. In addition, in real life demand can sometimes even be upward sloping if a good is a Giffen good, and supply can be downward sloping (e.g. labor supply in some special cases, depending on people's preferences between consumption and leisure). Yet introductory textbooks typically show downward-sloping demand and upward-sloping supply (e.g. see Mankiw, Principles of Economics, which discusses these concepts but only briefly, or more narrowly micro introductory books such as Frank, Microeconomics and Behavior).

    This is to a great degree for didactic reasons. It is much better for students to first master the basics with simple models, and when it comes to learning about costs, nicely behaved convex cost functions with a single minimum make learning easier than cost minimization with concave cost curves. Hence, even if empirically most cost curves were concave rather than convex, it would be very bad teaching practice to start with concave functions (or to go for full-blown realism, where cost functions might be piecewise, have different concavity/convexity at different points, be ill-defined somewhere, etc.).

  2. Because of Decreasing Returns to Scale - This was covered in great detail by Bayesian, but let me add more arguments and also rebut some of the arguments in your question.

    First, it is not unreasonable to assume that costs are convex in the long run. In a world of scarcity, a firm cannot increase its demand for factors of production forever without also affecting the costs of those factors; their prices will eventually rise (ceteris paribus). We have crystal clear evidence that wages rise in tight labor markets and that, generally speaking, a rightward shift in demand (ceteris paribus) raises prices. You argue that in perfect competition models firms are assumed to be small, but that is not a good argument in this case. Firms are assumed to be too small for their output to affect the market price of that output, so the output price can be taken as given (see Frank, Microeconomics and Behavior, p. 337). Perfect competition does not require the prices of inputs to be taken as given. In fact, a firm might operate in a perfectly competitive output market while facing a merely monopolistically competitive factor market (where the firm is a consumer, not a producer).

    Next, you argue that thanks to fixed costs a firm could just continuously invest in new factories, but I believe this argument is false. A fixed cost by definition cannot vary with output. If a firm increases output by building a new factory, the cost of the factory ceases to be a fixed cost. In fact, fixed costs primarily exist in the short run, as in the long run most costs are variable (see Mankiw, Principles of Economics, p. 260). In the long run, as you try to build more and more factories, you run into the same problems of scarcity of land, capital and labor, and thus bid up their prices. In fact, this is nicely visualized and explained in the Mankiw textbook with the picture below:

[Image: figure from Mankiw's textbook, not reproduced here]

Empirically, we observe that many industries have decreasing returns to scale (although constant returns to scale are common as well), and increasing returns to scale are rare (although not completely uncommon). See for example: Basu & Fernald, 1997; Gao & Kehrig, 2017.

Introductory texts by their nature do not deal with special cases but with more general ones. Most introductory textbooks likewise do not spend much time on Giffen goods, not just because modeling them would be difficult for 101 students but also because they are rarely observed (although I am not claiming that non-convex cost functions are as rare as Giffen goods).

  3. On the Aesthetics: I think Giskard raises a valid point that there are probably many economists who assume convex costs just for mathematical elegance. However:

    • I think Giskard slightly exaggerates the problem and is a bit too cynical about it. For sure there are economists who value mathematical elegance über alles, but there is an increasing trend in the share of empirical papers (see Angrist et al. 2017), even in microeconomics. I think a reasonable non-cynical explanation for the small share of empirical micro papers is that until very recently there was a lack of good micro data. It is also partly due to how fields are broken down: the share of empirical papers in industrial organization (which also relies heavily on cost functions) is quite high.

    • Empirically, most industries do not exhibit increasing returns to scale. While non-convex cost functions are definitely real (especially along some parts of the cost curve), the empirical evidence shows that decreasing returns to scale (and constant returns to scale as well) are quite common (e.g. see Basu & Fernald, 1997; Gao & Kehrig, 2017). That said, I think Giskard definitely has a valid point that some modelers will ignore empirics for the sake of mathematical elegance.

    • Last but not least, while I think mathematical elegance can explain why such an assumption features heavily in some published theoretical work, I don't think it can explain why it features in introductory micro texts. Is the quadratic cost function $c=q^2$ really mathematically elegant? I don't think so, but it is probably the most commonly used cost function you will ever see in intro texts.
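To illustrate why a cost function like $c=q^2$ is convenient for teaching, here is a minimal sketch of a price-taking firm's problem. The price `p = 10` and the function names are assumptions made for illustration only; the point is that the first-order condition $p = c'(q) = 2q$ gives a unique maximum, which a brute-force grid search confirms.

```python
# Hypothetical illustration: a price taker facing the cost function c(q) = q**2.

def cost(q):
    return q ** 2

def profit(q, p):
    # Revenue p*q minus cost
    return p * q - cost(q)

def optimal_output(p):
    # First-order condition: p = c'(q) = 2q, so q* = p / 2.
    # Because c is strictly convex, this is the unique profit maximum.
    return p / 2

p = 10.0  # assumed market price
q_star = optimal_output(p)

# Brute-force check on a grid of outputs from 0 to 20 in steps of 0.01
grid = [i / 100 for i in range(2001)]
q_grid = max(grid, key=lambda q: profit(q, p))

print(q_star, q_grid, profit(q_star, p))
```

With a convex cost function the closed-form answer and the grid search agree, which is what makes such exercises tractable in a first course.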

Regarding the Varian quote: on page 67 Varian states that he will first cover the situation with fixed factor costs and later move to variable factor costs. Hence, unless I am misreading Varian, I think the statement on page 68 is made under the assumption of constant factor prices. The explanation by Mankiw above, however, does not assume that.

Solution 2:

Theoretically, the cost function is a result of a cost minimization problem with a given production technology. Convex/linear/concave costs are a result of decreasing/constant/increasing returns to scale. The thinking behind convex costs is the idea of decreasing marginal product of your input goods for production.
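As a sketch of this link, assume (purely for illustration) a single input $x$ bought at a fixed price $w$ and a production function $y = x^{\alpha}$ with $\alpha > 0$. Neither the functional form nor the symbols $w$ and $\alpha$ come from the question; they are just a simple special case.

```latex
\begin{align*}
c(y)   &= \min_{x \ge 0} \{\, w x : x^{\alpha} \ge y \,\} = w\, y^{1/\alpha}, \\
c'(y)  &= \frac{w}{\alpha}\, y^{1/\alpha - 1}, \\
c''(y) &= \frac{w}{\alpha}\left(\frac{1}{\alpha} - 1\right) y^{1/\alpha - 2}.
\end{align*}
```

For $y > 0$, the sign of $c''(y)$ is the sign of $1/\alpha - 1$: the cost function is convex when $\alpha < 1$ (decreasing returns), linear when $\alpha = 1$ (constant returns), and concave when $\alpha > 1$ (increasing returns).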

As an example of the kind of thinking behind a convex cost function: If you want to produce one widget, you can do it with the 3 most skilled workers in town. If you want to produce two widgets, you can do it with the 7 most skilled workers in town because the 4 additional ones are slower. Alternatively, all workers have the same productivity, but you hire the cheapest first and the next ones will only do the job for more money. Alternatively, consider hours worked: A worker can produce one good in three hours, but the second one takes four hours because working is exhausting. Similarly, the first hours are paid at the regular contract wage, whereas extra hours need extra compensation. Alternatively, you need wood for production. For the first goods you can chop wood in your own forest, but once you need more you have to find additional, more expensive sources. And so on.
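The first example can be written out as arithmetic. Assume, hypothetically, that every worker is paid the same wage $w$ (the wage is not specified in the example itself):

```latex
\begin{align*}
c(1) &= 3w, & c(2) &= 7w, \\
MC_1 &= c(1) - c(0) = 3w, & MC_2 &= c(2) - c(1) = 4w.
\end{align*}
```

Under that assumption, and with $c(0) = 0$, the second widget costs more than the first ($4w > 3w$). Rising marginal cost is exactly what convexity of the cost function means.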

Remember that the supply curve is the increasing part of the marginal cost curve. The supply curve in Econ 101 is upward sloping because of the above intuition. There may be increasing returns to scale over some range, e.g. because workers can divide jobs and there are gains from specialization. However, we assume that those gains eventually come to an end as marginal returns diminish.

Next, and this is not a good reason, note that a monopolist optimally produces a quantity such that marginal revenue - marginal cost = 0. Convex costs ensure that this is in fact a maximum.
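A short sketch of the second-order condition behind this remark, writing $R(q)$ for revenue and $c(q)$ for cost (notation introduced here for illustration):

```latex
\begin{align*}
\pi(q)   &= R(q) - c(q), \\
\pi'(q)  &= R'(q) - c'(q) = 0 \quad \text{(marginal revenue equals marginal cost)}, \\
\pi''(q) &= R''(q) - c''(q) < 0 \quad \text{(condition for a local maximum)}.
\end{align*}
```

As an example under an assumed linear inverse demand $P(q) = a - bq$ with $b > 0$, revenue is $R(q) = aq - bq^2$ and $R''(q) = -2b < 0$. Then any convex cost function ($c''(q) \ge 0$) gives $\pi''(q) < 0$, so the point where marginal revenue equals marginal cost is indeed a maximum. More generally, the argument needs revenue to be concave as well as costs to be convex.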

Solution 3:

If the cost function is globally concave in output $y$, then

  • the profit function is convex in $y$ and the optimal (profit maximizing) output is not characterized by the equality between price and marginal cost, so price taker firms have an optimal output level that is either 0 or tends to infinity
  • the profit is negative at least for low levels of output (if $c'(0)>p$)

Such a concavity assumption would have difficulty explaining why about 60% of firms produce less than 5% of total output.
For these reasons, cost functions are probably not (globally) concave, except perhaps for firms with strong market power... Instead, it is quite plausible that the cost function is locally convex and exhibits nonconvexities here and there.
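The corner-solution argument can be checked numerically. The sketch below assumes a specific globally concave cost function, $c(y)=\sqrt{y}$, a price of 0.5, and an artificial output cap `y_max` (needed only because the unconstrained optimum would be unbounded). All of these are illustrative choices, not taken from the answer.

```python
import math

def cost(y):
    # Assumed globally concave cost function: c(y) = sqrt(y)
    return math.sqrt(y)

def profit(y, p):
    # Price taker's profit; convex in y because the cost is concave
    return p * y - cost(y)

def best_output(p, y_max, steps=1000):
    # Grid search over [0, y_max]; y_max is an artificial cap on output
    grid = [y_max * i / steps for i in range(steps + 1)]
    return max(grid, key=lambda y: profit(y, p))

p = 0.5  # assumed price

# With a small cap, every positive output loses money, so the firm produces 0.
print(best_output(p, y_max=2.0))
# With larger caps, the optimum always sits at the cap itself.
print(best_output(p, y_max=10.0))
print(best_output(p, y_max=100.0))
# The point where price equals marginal cost (y = 1) is a profit minimum.
print(profit(1.0, p))
```

In this example the optimum is never interior: it is either zero or whatever the cap allows, and profit is negative at low output levels.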

Solution 4:

Increasing and convex costs are a result of decreasing returns to scale. These are mainly due to the limited availability of (local) input factors. Other contributing factors are the decline of management efficiency of large-scale production, the imperfection of internal supervision and control mechanisms, and more complex information transmission.

Solution 5:

In my very limited empirical experience it seemed that cost functions were in fact non-convex for most output levels in the few industries I looked at. Allocating costs to exact parts of a process is very difficult, but marginal costs were generally assumed to be constant, with some jumps as capacity constraints were reached.
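One possible reading of "constant marginal costs with jumps at capacity constraints" is sketched below. It assumes that each block of capacity requires a lump-sum outlay, so total cost jumps whenever a new block is needed. The parameter values and this interpretation of the jumps are assumptions for illustration, not something stated above.

```python
import math

MARGINAL_COST = 2   # assumed constant cost per unit
CAPACITY = 50       # assumed units per capacity block
BLOCK_COST = 100    # assumed lump-sum cost of each capacity block

def cost(q):
    # Constant marginal cost plus one lump-sum payment per capacity block in use
    blocks = math.ceil(q / CAPACITY)
    return MARGINAL_COST * q + BLOCK_COST * blocks

def is_midpoint_convex(a, b):
    # A convex function satisfies c((a+b)/2) <= (c(a) + c(b)) / 2 for all a, b
    mid = (a + b) / 2
    return cost(mid) <= (cost(a) + cost(b)) / 2

print(cost(50), cost(51))           # total cost jumps when capacity is exceeded
print(is_midpoint_convex(10, 30))   # holds within a single capacity block
print(is_midpoint_convex(50, 52))   # fails across the jump
```

Under these assumptions the cost function is linear within each capacity block but fails the convexity test across a jump. If the jumps were instead upward steps in marginal cost alone (e.g. overtime pay), the total cost function would remain convex.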

The theorems in microeconomics/general equilibrium theory that deal with the existence of solutions to profit maximization problems, the existence of competitive equilibria and the Pareto-efficiency of these equilibria are well-liked for their mathematical elegance. However, they rely on a bunch of convexity/concavity assumptions. (The branch of math used is convex analysis.)

Hence these assumptions are dictated more by the desire for elegant theoretical solutions than by empirical knowledge.

Note that there are many possible rationalizations for why cost functions may be convex, some (interesting ones) are outlined in the other answers. I would argue that these are mostly rationalizations of the assumption, not proof of its empirical validity. To be fair, I also do not provide empirical proof.