There are several kinds of universes that one can consider in type theory, which vary along many different axes.
A universe a la Russell is a type whose terms are types. In the presence of a separate judgment? “$A type$”, this can be formulated as a rule
Thus, the type formers have rules saying which universe they belong to, such as:
With universes a la Russell, we can also omit the judgment “$A type$” and replace it everywhere by a judgment that A is a term of some universe. This is the approach taken by the HoTT book and by Coq?.
A universe a la Tarski is a type together with an “interpretation” operation allowing us to regard its terms as types (or “codes for types”) (see Luo 12). Thus we have a rule such as
We usually also have operations on the universe corresponding to (but not identical to) type formers, such as
with an equality $El(pi(A,B))=\Pi \, El(A)\, El(B)$. Usually this latter equality (and those for other type formers) is a judgmental equality. If it is only an equivalence (i.e. we have a rule which gives us a canonical term of the equivalence type), we may speak of a weakly a la Tarski universe — this case is not traditionally considered in the literature but arises more naturally in homotopical models; see model of type theory in an (infinity,1)-topos and Gallozzi 14.
We can give a slightly different definition of weakly à la Tarski universe using propositional equality? and a larger universe. More precisely, we can consider two (or many) universes $U$ and $U'$ with the usual rules for the relative reflection $el(a):U'$ for any $a:U$, a choice of weakly or strongly a la Tarski computational rules for the reflections $El$ and $El'$, and a computation rule for the relative reflection el of $U$ inside $U'$ based on propositional equality, which gives us canonical elements of the identity types $Id_{U'}(\pi'(el(a),el(b)),el(\pi(a,b)))$ and similarly for the other type formers.
If the containing universe is univalent the two definitions turn out to coincide.
Universes defined internally via induction-recursion are (strongly) a la Tarski. Weakly a la Tarski universes are easier to obtain in semantics; see model of type theory in an (infinity,1)-topos. They are somewhat more annoying to use, but probably suffice for most purposes.
Universe polymorphism means that you don’t have to do things separately for each universe level; you can do it once “parametrized” over universes and then instantiate it to any particular universe.
Typical ambiguity means you don’t have to explicitly say what universe level you’re working at; you just say something like “Type” and the system (or the reader) guesses it.
They are discussed further in this blog post.
Both Russell and Tarski style universes can be polymorphic or not, and typically ambiguous or not.
A tower of universes is cumulative if $A:U_i$ implies $A:U_{i+1}$ (rather than, say, $Lift(A):U_{i+1}$).
Cumulative Russell universes have some issues; see for instance Luo 12.
Coq? uses Russell style universes with typical ambiguity. It used to have a sort of “fake” universe polymorphism, but recent versions of Coq have real universe polymorphism (thanks to Mathieu Sozeau). For practical purposes, it also has cumulativity, although there is some question (perhaps mainly semantic) of whether this is true internally or whether it uses casts that are simply hidden from the user.
Agda uses non-cumulative Russell style universes with universe polymorphism but without typical ambiguity: you always have to explicitly indicate the universe level.
The HoTT Book (first edition) uses cumulative Russell style universes with universe polymorphism and (except for a few places) typical ambiguity.
More references and discussion can be found at model of type theory in an (infinity,1)-topos.
Cesare Gallozzi, Constructive Set Theory from a Weak Tarski Universe, MSc thesis (2014) (arXiv:1411.5591)
Zhaohui Luo, Notes on Universes in Type Theory, 2012 (pdf)
Tarski style universes with subtyping: