Docsystem internals

Background: Julia type system

The type system consists of the following concepts:

  • Concrete types, which can actually be instantiated.
  • Unions of concrete types. The subtype relation (the <: operator) is really just a subset relation.
    • Int <: Union{Int64,Float64}
    • You can also form unions of abstract and union types. The type system flattens these things:
      julia> Union{Union{Int,Float64}, AbstractString, Integer}
      Union{Float64, AbstractString, Integer}
  • Abstract types: they form a type hierarchy, with Any at the top. They function as unions of all their subtypes.
    • They do form nested or disjoint sets though (any two are either contained in one another, or disjoint).
    • The type system does seem wary of the future, though: Signed <: Union{subtypes(Signed)...} does not hold. This is likely because another subtype may later be added to Signed, which would invalidate the relation for a union constructed earlier.
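
The relations listed above can be checked directly. The following is a minimal sketch, not a complete description of the subtyping rules; the variable names are only illustrative, and Int64 is spelled out so that the result does not depend on the machine word size.

```julia
using InteractiveUtils  # provides subtypes() when running outside the REPL

# A concrete type is a subtype of any union that contains it.
in_union = Int64 <: Union{Int64, Float64}

# Nested unions are flattened, and Int is absorbed by Integer.
flattened = Union{Union{Int, Float64}, AbstractString, Integer} ==
            Union{Float64, AbstractString, Integer}

# Two abstract types are either nested or disjoint.
nested = Signed <: Integer
disjoint = typeintersect(Signed, AbstractFloat) === Union{}

# The union of the current subtypes fits inside Signed, but not the other way around.
signed_union = Union{subtypes(Signed)...}
union_in_signed = signed_union <: Signed
signed_in_union = Signed <: signed_union

@show in_union flattened nested disjoint union_in_signed signed_in_union
```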

Then parametric types enter.

  • You can form concrete types: Vector{Int} or Vector{Any}.
  • The type parameter (T for Vector{T}) can be a single concrete type (e.g. Int) or a set of types (e.g. Integer, Union{X,Y}, Any). But Vector{T} will always be a concrete type.
  • You can then form UnionAll type unions, where you (optionally) restrict the type parameters somehow (Vector{T} where T <: Integer).
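
As a small sketch of the points above (the names IntVectors, members and so on are made up for illustration), the snippet below checks that a fully parameterized Vector is concrete and that a UnionAll collects the matching Vector{T} types. It also relies on type parameters being invariant, which the list does not state explicitly.

```julia
# Every fully parameterized Vector is concrete, even with an abstract parameter.
all_concrete = all(isconcretetype, (Vector{Int}, Vector{Any}, Vector{Integer}))

# Type parameters are invariant: Vector{Int} is not a subtype of Vector{Integer}.
invariant = !(Vector{Int} <: Vector{Integer})

# A UnionAll collects every Vector{T} whose parameter satisfies the bound.
IntVectors = Vector{T} where T <: Integer
is_unionall = IntVectors isa UnionAll
members = Vector{Int} <: IntVectors && Vector{Integer} <: IntVectors
non_member = !(Vector{Float64} <: IntVectors)

@show all_concrete invariant is_unionall members non_member
```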

There are also tuple types.

  • Tuple types are covariant (Tuple{Int} <: Tuple{Any}).
  • A tuple type is concrete only if all its type parameters are concrete.
  • There is also Vararg{T,N}, which can only appear as the last parameter of a tuple type.
    • If N is fixed, it reduces down to the corresponding tuple:
      julia> Tuple{String,Vararg{Int,3}}
      Tuple{String,Int64,Int64,Int64}
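
The tuple rules can be verified in the same way. This is only a sketch with illustrative variable names; the last check, on an open-ended Vararg without a fixed N, is an addition that the list above does not cover.

```julia
# Tuple types are covariant in their parameters.
covariant = Tuple{Int} <: Tuple{Any}

# A tuple type is concrete only if every parameter is concrete.
concrete = isconcretetype(Tuple{Int, String})
not_concrete = !isconcretetype(Tuple{Int, Any})

# A Vararg with a fixed length expands to the plain tuple type.
fixed = Tuple{String, Vararg{Int, 3}} == Tuple{String, Int, Int, Int}

# Without a fixed length, it matches any number of trailing Ints.
open_ended = Tuple{String, Int, Int} <: Tuple{String, Vararg{Int}}

@show covariant concrete not_concrete fixed open_ended
```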

Representing types

The whole type system can also be represented in Julia. A set of types exists for this purpose:

  • Concrete types: DataType, Union and UnionAll. Actually, there is also Core.TypeofBottom, which is a singleton for Union{} – the empty set of types.

  • They are all subtypes of Type: anything that represents a type or a union of types is a subtype of Type. Type{T}, for any T isa Type, matches exactly that type T, so Type{T} where T matches any type.

  • All Julia types can be represented by objects of one of these types.

  • DataType represents concrete types (primitive types, structs etc.) and abstract types, as long as they are not parametric.

  • Union represents:

    • "Irreducible unions". E.g. Union{Int} === Int, which isa DataType, but Union{Int,Float64} isa Union.
  • UnionAll represents:

    • Parametric types.

    • Consists of a .body, which is another Type (often DataType, but can also be a Union or, in fact, when you have multiple type variables, it nests UnionAlls).

      julia> t = Tuple{Int,T} where T
      Tuple{Int64,T} where T
      
      julia> @show typeof(t.var) t.var;
      typeof(t.var) = TypeVar
      t.var = T
      
      julia> @show typeof(t.body) t.body;
      typeof(t.body) = DataType
      t.body = Tuple{Int64,T}
      
      julia> t = Tuple{Int,T,U} where T where U
      Tuple{Int64,T,U} where T where U
      
      julia> t.body
      Tuple{Int64,T,U} where T
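
To make the nesting concrete, here is a minimal sketch that walks down the .body chain of a UnionAll. The helper typevars is hypothetical (it is not part of Base); it only relies on the .var and .body fields shown above.

```julia
# Hypothetical helper: collect the type variables of a (possibly nested)
# UnionAll, outermost first, and return them with the innermost body.
function typevars(t)
    names = Symbol[]
    while t isa UnionAll
        push!(names, t.var.name)
        t = t.body
    end
    return names, t
end

# Which representation type is used for which kind of type.
kinds = (typeof(Int), typeof(Integer), typeof(Union{Int, Float64}),
         typeof(Vector), typeof(Union{}))

t = Tuple{Int, T, U} where T where U
names, body = typevars(t)

@show kinds names typeof(body)
```

The outermost variable is U, because `where T where U` binds T first and then wraps the result in a second UnionAll over U.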

Method signatures

  • Function signatures only cover positional arguments, not keyword arguments.
  • A type signature is normally a Tuple (a DataType of nameof(t) = :Tuple). Signatures of parametric methods are UnionAlls.
  • The empty signature (f()) is Tuple{}.
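
These signatures can be inspected through the Method objects returned by methods(). Note that the .sig field of a Method also includes the type of the function itself as its first parameter, so the sketch below strips it to get the argument-only form used in this document. The helper argsig and the functions f, g and h are made up for illustration, and argsig only handles non-parametric methods.

```julia
f() = 1
f(::Int, ::AbstractString) = 2
g(x::T) where {T <: Integer} = x
h(x; scale = 1) = x * scale

# Hypothetical helper: drop the leading typeof(f) parameter from the
# signature of a non-parametric method.
argsig(m::Method) = Tuple{collect(m.sig.parameters)[2:end]...}

f_sigs = [argsig(m) for m in methods(f)]
h_sigs = [argsig(m) for m in methods(h)]   # the keyword argument does not appear
g_sig = first(methods(g)).sig              # parametric method: a UnionAll

@show f_sigs h_sigs g_sig
```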

Docstrings

All the docstrings in a module are collected into a metadata variable in that module (i.e. every module has its own variable whose name is stored in Docs.META; that name is gensymed when Base is loaded).

Every docstring is then identified by its binding and signature. A binding is effectively just a (::Module, ::Symbol) pair, but represented with the Docs.Binding type. So, e.g. (LinearAlgebra, :eigen) is a binding for LinearAlgebra.eigen.

This metadata is an IdDict{Any,Any} and can be accessed with the Docs.meta(::Module) function. The keys of the Docs.meta dictionary are the Binding objects. Each value in the dictionary is a Docs.MultiDoc object, which is itself basically a dictionary. It contains all the docstrings attached to a particular binding.

Docs.MultiDoc does not seem to really have a well-defined API. But you can access the docstrings via the .docs field, which is another IdDict, but this time the keys are the different signatures.
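
The lookup chain described above (module metadata, then binding, then signature) can be sketched as follows. The module DocDemo and its functions are made up for illustration, and since these are internals rather than public API, the field names may differ between Julia versions.

```julia
module DocDemo
    "docs for foo()"
    foo() = 1

    "docs for foo(::Int)"
    foo(::Int) = 2
end

metadata = Docs.meta(DocDemo)            # IdDict keyed by Docs.Binding
binding = Docs.Binding(DocDemo, :foo)    # the (module, symbol) pair
multidoc = metadata[binding]             # Docs.MultiDoc for DocDemo.foo

# .docs maps each signature to its DocStr; .text holds the raw docstring parts.
for (sig, docstr) in multidoc.docs
    println(sig, " => ", docstr.text[1])
end
```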

Docstring signatures

A single binding, in general, can have multiple docstrings attached (e.g. a single function can have multiple methods). The docsystem tells them apart by using signatures as keys, together with a few special values:

  • For functions:
    • The function signature is the signature used to identify it (f() -> Tuple{}, f(::Int, ::AbstractString) -> Tuple{Int,AbstractString} etc.). That includes parametric methods, although there appears to be a bug with those.
    • Function declarations are attached to the Union{} "signature".
  • For a struct:
    • The docstrings of the fields are stored in the main docstring (DocStr object) under the :fields key in .data.
    • The docstring of an inner constructor seems to be lost into the void.
    • If there is no type docstring, the field docstrings are also lost into the void.

Union{} is generally the signature for the "highest level" docstring:

  • For a function declaration: function foo end
  • For a type declaration: struct Foo ... end
  • Also for modules, constants, abstract types

Note that Tuple{Vararg{Any,N} where N} === Tuple, so the signature of f(xs...) comes up as just Tuple.
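
The keys described in this section can be observed on a small example. This is a sketch under some assumptions: SigDemo and the docs helper are made up for illustration, and the field docstrings are read from the :fields entry of .data, which is an internal detail that may vary between Julia versions.

```julia
module SigDemo
    "declaration of foo"
    function foo end

    "method foo()"
    foo() = 1

    "type Baz"
    struct Baz
        "field x"
        x::Int
    end
end

# Hypothetical helper: the signature => DocStr dictionary for one binding.
docs(name) = Docs.meta(SigDemo)[Docs.Binding(SigDemo, name)].docs

foo_keys = collect(keys(docs(:foo)))     # Union{} (declaration) and Tuple{} (method)
baz_doc = docs(:Baz)[Union{}]            # the type docstring lives under Union{}
field_docs = baz_doc.data[:fields]       # field docstrings, keyed by field name

# An unbounded Vararg of Any is the same type as plain Tuple.
vararg_is_tuple = Tuple{Vararg{Any}} === Tuple

@show foo_keys field_docs vararg_is_tuple
```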

Docsystem bugs

Accessing docstrings of a submodule that has the same name:

"module Foo"
module Foo
    "module Foo.Foo"
    module Foo
        ""
        f() = 1
    end
    g() = 2
end

julia> Foo
Main.Foo

help?> Foo
module Foo

julia> Foo.Foo
Main.Foo.Foo

help?> Foo.Foo
module Foo

help?> Foo.Foo.Foo
module Foo.Foo

Inner constructors, fields

  • Inner constructor docstrings are lost into the void.
  • If there is no type declaration docstring, field docstrings are also lost into the void.
  • Old issue: JuliaLang/julia#16730

Parametric methods

  • If you have a parametric type signature (e.g. foo(::T) where T <: Integer), the stored signature will be a Union{<normal sig>, Tuple{T}}.
  • Old issue: #29437

Call syntax

Ref: https://github.com/JuliaDocs/Documenter.jl/issues/558

A few notes for future reference: there is actually an upstream problem here too. The docstring storage does not distinguish between call syntax methods and outer constructor methods:

module Foo1
    "Foo.Bar"
    struct Bar
        x :: Int
    end
    "constructor"
    Bar(::String) = Bar(0)
    "call syntax"
    (::Bar)(::Array) = 1
end

julia> Docs.meta(Foo1)[Docs.Binding(Foo1,:Bar)].docs
IdDict{Any,Any} with 3 entries:
  Union{}       => DocStr(svec("Foo.Bar"), nothing, Dict{Symbol,Any}(:typesig=>Union{},:module=>Main.Foo1,:linenumb…
  Tuple{String} => DocStr(svec("constructor"), nothing, Dict{Symbol,Any}(:typesig=>Tuple{String},:module=>Main.Foo1…
  Tuple{Array}  => DocStr(svec("call syntax"), nothing, Dict{Symbol,Any}(:typesig=>Tuple{Array},:module=>Main.Foo1,…

In fact, when the two share the same argument signature, you can see that one docstring overwrites the other:

module Foo2
    "Foo.Bar"
    struct Bar
        x :: Int
    end
    "constructor"
    Bar(::String) = Bar(0)
    "call syntax"
    (::Bar)(::String) = 1
end

┌ Warning: Replacing docs for `Main.Foo2.Bar :: Tuple{String}` in module `Main.Foo2`
└ @ Base.Docs docs/Docs.jl:223

julia> Docs.meta(Foo2)[Docs.Binding(Foo2,:Bar)].docs
IdDict{Any,Any} with 2 entries:
  Union{}       => DocStr(svec("Foo.Bar"), nothing, Dict{Symbol,Any}(:typesig=>Union{},:module=>Main.Foo2,:linenumb…
  Tuple{String} => DocStr(svec("call syntax"), nothing, Dict{Symbol,Any}(:typesig=>Tuple{String},:module=>Main.Foo2…