Monday, December 04, 2006

Super Type Tokens

When we added generics to Java in JDK5, I changed the class java.lang.Class to become a generic type. For example, the type of String.class is now Class<String>. Gilad Bracha coined the term type tokens for this. My intent was to enable a particular style of API, which Joshua Bloch calls the THC, or Typesafe Heterogeneous Container, pattern. For an example of where this is used, see the API for annotations:

public <A extends Annotation> A java.lang.Class.getAnnotation(Class<A> annotationClass)
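As a minimal sketch of what that signature buys the caller: passing a Class<A> token makes the return type A, so no cast is needed. The Author annotation and the Annotated class below are hypothetical names invented for this illustration.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

public class AnnotationTokenDemo {

    // Hypothetical annotation, defined here only for illustration.
    @Retention(RetentionPolicy.RUNTIME)
    @interface Author {
        String value();
    }

    // Hypothetical annotated class.
    @Author("Neal")
    static class Annotated {}

    // Returns the author recorded on the class, or null if there is none.
    static String authorOf(Class<?> c) {
        // Author.class is a Class<Author>, so the result is typed as Author.
        Author a = c.getAnnotation(Author.class);
        return a == null ? null : a.value();
    }

    public static void main(String[] args) {
        System.out.println(authorOf(Annotated.class));
        System.out.println(authorOf(String.class));
    }
}
```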

My earliest use of this feature (like my earliest use of all recent Java language features) appears in the compiler for the Java programming language (javac), in this case as a utility called Context, and you can find the code in the open source version. It was a utility that allowed the compiler to be written as a bunch of separate classes that all refer to each other, and solved the hard problem of getting all the parts created in an order such that they can be initialized with references to each other. The utility is also used to replace pieces of the compiler, for example to make related tools like javadoc and apt, the Annotation Processing Tool, and for testing. Today I would describe the utility as a simple dependency injection framework, but that wasn't a popular buzzword at the time.

Here is a simple but complete example of an API that uses type tokens in the THC pattern, from Josh's 2006 JavaOne talk:

import java.util.HashMap;
import java.util.Map;

public class Favorites {
    private Map<Class<?>, Object> favorites =
        new HashMap<Class<?>, Object>();
    public <T> void setFavorite(Class<T> klass, T thing) {
        favorites.put(klass, thing);
    }
    public <T> T getFavorite(Class<T> klass) {
        return klass.cast(favorites.get(klass));
    }
    public static void main(String[] args) {
        Favorites f = new Favorites();
        f.setFavorite(String.class, "Java");
        f.setFavorite(Integer.class, 0xcafebabe);
        String s = f.getFavorite(String.class);
        int i = f.getFavorite(Integer.class);
    }
}

A Favorites object acts as a typesafe map from type tokens to instances of the type. The main program in this snippet adds a favorite String and a favorite Integer, which are later taken out. The interesting thing about this pattern is that a single Favorites object can be used to hold things of many (i.e. heterogeneous) types but in a typesafe way, in contrast to the usual kind of map in which the values are all of the same static type (i.e. homogeneous). When you get your favorite String, it is of type String and you don't have to cast it.

There is a limitation to this pattern. Erasure rears its ugly head:

Favorites:15: illegal start of expression
f.setFavorite(List<String>.class, Collections.emptyList());
                          ^

You can't add your favorite List<String> to a Favorites because you simply can't make a type token for a generic type. This design limitation is one that a number of people have been running into lately, most recently Ted Neward. "Crazy" Bob Lee also asked me how to solve a related problem in a dependency injection framework he is developing. The short answer is that you can't do it using type tokens.
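A small sketch of why no class literal can stand for List<String>: at run time a List<String> and a List<Integer> share one and the same Class object, so a Class token has nothing with which to tell them apart. The class and method names here are made up for the illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class ErasureDemo {

    // True when both lists report the very same runtime class object.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<String>();
        List<Integer> numbers = new ArrayList<Integer>();
        return strings.getClass() == numbers.getClass();
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());
    }
}
```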

On Friday I realized you can solve these problems without using type tokens at all, using a library. I wish I had realized this three years ago; perhaps there was no need to put support for type tokens directly in the language. I call the new idea super type tokens. In its simplest form it looks like this:

public abstract class TypeReference<T> {}

The abstract qualifier is intentional. It forces clients to subclass this in order to create a new instance of TypeReference. You make a super type token for List<String> like this:

TypeReference<List<String>> x = new TypeReference<List<String>>() {};

Not quite as convenient as writing List<String>.class, but this isn't too bad. It turns out that you can use a super type token to do nearly everything you can do with a type token, and more. The object that is created on the right-hand side is an instance of an anonymous class, and using reflection you can get its generic superclass type, including the type arguments. Josh calls this pattern "Gafter's Gadget". Bob Lee elaborated on this idea as follows:
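Before the fuller version, here is a minimal sketch of the reflection step on its own, using the one-line TypeReference from above (nested in a demo class so that the example fits in one file; the demo class and method names are assumptions of this sketch).

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.List;

public class SuperTypeTokenDemo {

    // The simplest form from the text.
    abstract static class TypeReference<T> {}

    // Recovers the type argument that the anonymous subclass was declared with.
    static Type captured() {
        TypeReference<List<String>> x = new TypeReference<List<String>>() {};
        // The anonymous class records "extends TypeReference<List<String>>".
        ParameterizedType superclass =
            (ParameterizedType) x.getClass().getGenericSuperclass();
        return superclass.getActualTypeArguments()[0];
    }

    public static void main(String[] args) {
        System.out.println(captured());
    }
}
```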

import java.lang.reflect.Constructor;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.ArrayList;
import java.util.List;

/**
 * References a generic type.
 *
 * @author crazybob@google.com (Bob Lee)
 */
public abstract class TypeReference<T> {

    private final Type type;
    private volatile Constructor<?> constructor;

    protected TypeReference() {
        Type superclass = getClass().getGenericSuperclass();
        if (superclass instanceof Class) {
            throw new RuntimeException("Missing type parameter.");
        }
        this.type = ((ParameterizedType) superclass).getActualTypeArguments()[0];
    }

    /**
     * Instantiates a new instance of {@code T} using the default, no-arg
     * constructor.
     */
    @SuppressWarnings("unchecked")
    public T newInstance()
            throws NoSuchMethodException, IllegalAccessException,
                   InvocationTargetException, InstantiationException {
        if (constructor == null) {
            Class<?> rawType = type instanceof Class<?>
                ? (Class<?>) type
                : (Class<?>) ((ParameterizedType) type).getRawType();
            constructor = rawType.getConstructor();
        }
        return (T) constructor.newInstance();
    }

    /**
     * Gets the referenced type.
     */
    public Type getType() {
        return this.type;
    }

    public static void main(String[] args) throws Exception {
        List<String> l1 = new TypeReference<ArrayList<String>>() {}.newInstance();
        List l2 = new TypeReference<ArrayList>() {}.newInstance();
    }
}

This pattern can be used to solve Ted Neward's problem, and most problems where you would otherwise use type tokens but you need to support generic types as well as reifiable types. Although this isn't much more than a generic factory interface, the automatic hook into the rich generic reflection system is more than you can get with simple class literals. With a few more bells and whistles (toString, hashCode, equals, etc) I think this is a worthy candidate for inclusion in the JDK.

21 comments:

Anonymous said...

How about casting? I've had a few uses for the following method:

    @SuppressWarnings("unchecked")
    public static <E> List<E> castList(Class<E> elementClass, Object obj) {
        if (obj == null) return null;
        List<E> list = (List<E>) obj;
        for(E element : list) {
            if (! elementClass.isInstance(element)) {
                throw new ClassCastException(String.format(
                        "Element '%s' is instance of %s. Expected: %s",
                        element, element.getClass(), elementClass));
            }
        }
        return list;
    }

BTW: How does one properly typeset code snippets here? <pre> does not work...

Neal Gafter said...

axel: your castList method is not type safe. If I create a new ArrayList of Integers, I can use your method to cast it to a List of String and put strings into what is supposed to be a list of integers.
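One way to see the hole, as a sketch: an empty list passes the element check trivially, after which the same list object is visible under two incompatible static types. The castList method is copied from the comment above; the surrounding class and the pollutes method are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class CastListDemo {

    // Copied from the comment above.
    @SuppressWarnings("unchecked")
    public static <E> List<E> castList(Class<E> elementClass, Object obj) {
        if (obj == null) return null;
        List<E> list = (List<E>) obj;
        for (E element : list) {
            if (!elementClass.isInstance(element)) {
                throw new ClassCastException(String.format(
                        "Element '%s' is instance of %s. Expected: %s",
                        element, element.getClass(), elementClass));
            }
        }
        return list;
    }

    // Returns true if a String ended up inside a List<Integer>.
    static boolean pollutes() {
        List<Integer> ints = new ArrayList<Integer>();
        // The list is empty, so there is nothing for the loop to reject.
        List<String> strings = castList(String.class, ints);
        strings.add("not an integer");
        // Read through a wildcard view so that no Integer cast is applied.
        List<?> view = ints;
        return view.get(0) instanceof String;
    }

    public static void main(String[] args) {
        System.out.println(pollutes());
    }
}
```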

Anonymous said...

It's cool, but some cases are unsafe.
I can create a type parameterized by an unbounded wildcard, which is not reifiable.

ArrayList<?> instance =
    new TypeReference<ArrayList<?>>() {}.newInstance();

You have to do more checks in the constructor.
Furthermore, this line generates a
CCE in the constructor:
ArrayList<String>[] instance =
    new TypeReference<ArrayList<String>[]>() {}.newInstance();

cheers,
Rémi
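A sketch of what reflection reports for these two cases, using a cut-down TypeReference that keeps only the constructor logic from the post. Reading the listing above, the array case yields a GenericArrayType, so the cast to ParameterizedType inside newInstance() appears to be where the ClassCastException would come from; the demo class and method names are assumptions.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.ArrayList;

public class TypeArgumentKinds {

    // Cut-down version of the post's class: captures the type argument only.
    abstract static class TypeReference<T> {
        final Type type;

        protected TypeReference() {
            Type superclass = getClass().getGenericSuperclass();
            this.type = ((ParameterizedType) superclass).getActualTypeArguments()[0];
        }
    }

    // ArrayList<?>: a ParameterizedType whose single argument is a WildcardType.
    static Type wildcardCase() {
        return new TypeReference<ArrayList<?>>() {}.type;
    }

    // ArrayList<String>[]: a GenericArrayType, neither Class nor ParameterizedType.
    static Type arrayCase() {
        return new TypeReference<ArrayList<String>[]>() {}.type;
    }

    public static void main(String[] args) {
        System.out.println(wildcardCase().getClass().getInterfaces()[0].getSimpleName());
        System.out.println(arrayCase().getClass().getInterfaces()[0].getSimpleName());
    }
}
```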

Anonymous said...

I'm not completely following, could someone rewrite the Favorites class utilizing the TypeReference class? Thanks.

Unknown said...

I know this is probably the sort of thing that is reinvented all the time, but I thought I should mention that I saw this kind of technique before, on this blog post. It is in Portuguese; the authors are instructors at a Brazilian Java training company.

Anonymous said...

A little bit OT, but anyway...

Regarding the Context and the Key<T>, I've defined something similar: like a map, but where the key specifies the type of the value using generics, so that you can access the values without having to cast.

It applies to particular sorts of lookup, where you have keys defined as constants, and all the values are of different types.

It would be good if interfaces/classes for this were added to the JSE, as this is a pretty common usage of map, and currently all you can do is use Map<Object,Object> and do some casting.
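A rough sketch of the kind of map this comment describes, assuming identity-based keys declared as constants. Key, TypedMap, NAME and PORT are hypothetical names, and this is not the code of javac's Context.

```java
import java.util.HashMap;
import java.util.Map;

public class TypedKeyMapDemo {

    // A key that carries the static type of its value. Keys compare by identity.
    static final class Key<T> {
        final String name;

        Key(String name) {
            this.name = name;
        }
    }

    static final class TypedMap {
        private final Map<Key<?>, Object> values = new HashMap<Key<?>, Object>();

        <T> void put(Key<T> key, T value) {
            values.put(key, value);
        }

        // Safe as long as values only enter through put, which ties them to the key's T.
        @SuppressWarnings("unchecked")
        <T> T get(Key<T> key) {
            return (T) values.get(key);
        }
    }

    static final Key<String> NAME = new Key<String>("name");
    static final Key<Integer> PORT = new Key<Integer>("port");

    public static void main(String[] args) {
        TypedMap settings = new TypedMap();
        settings.put(NAME, "localhost");
        settings.put(PORT, 8080);
        String name = settings.get(NAME); // no cast at the call site
        int port = settings.get(PORT);
        System.out.println(name + ":" + port);
    }
}
```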

Shakeel Mahate said...

How about making the TypeReference.newInstance() more robust by allowing Object ... initargs.

Currently you only allow zero-arg constructor.

Anonymous said...

Using TypeReference in the Favorites examples would be something like:


import java.lang.reflect.Constructor;
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class Test {

    public static void main(String[] args) {
        ArrayList<String> list = new ArrayList<String>();
        list.add("Just me");

        Favorites f = new Favorites();
        f.setFavorite1(new TypeReference<ArrayList<String>>() {
        }, list);

        f.setFavorite1(new TypeReference<String>() {
        }, "Hello");

        List<String> newList = f.getFavorite1(new TypeReference<ArrayList<String>>() {
        });

        String str = f.getFavorite1(new TypeReference<String>() {
        });

        System.out.println(str);

        System.out.println(newList);
    }

}

class Favorites {
    private Map<TypeReference<?>, Object> favorites1 = new HashMap<TypeReference<?>, Object>();

    public <T> void setFavorite1(TypeReference<T> typeRef, T thing) {
        favorites1.put(typeRef, thing);
    }

    @SuppressWarnings("unchecked")
    public <T> T getFavorite1(TypeReference<T> typeRef) {
        return (T) favorites1.get(typeRef);
    }

}

abstract class TypeReference<T> {

    private final Type type;
    private volatile Constructor<?> constructor;

    protected TypeReference() {
        Type superclass = getClass().getGenericSuperclass();
        if (superclass instanceof Class) {
            throw new RuntimeException("Missing type parameter.");
        }
        this.type = ((ParameterizedType) superclass).getActualTypeArguments()[0];
    }

    public Type getType() {
        return this.type;
    }

    // Two references are equal when they capture the same type.
    public boolean equals(Object o) {
        return (o instanceof TypeReference)
            ? ((TypeReference) o).type.equals(this.type)
            : false;
    }

    public int hashCode() {
        return this.type.hashCode();
    }
}


Can you find a case where this code is not type safe?

Anonymous said...

I just think the applications we dream up to extend the language are interesting, to say the least. Simply written code is much easier to maintain. Type safe code is great because the compiler is doing all the work while the coder thinks about the application. As the language evolves, we should make sure that we don't make the language so cryptic that it becomes difficult to maintain. (One of the downfalls of C++?) Why do we continue to push the language to become more cryptic when the existing language is fine in most cases? I like the type safe checking that Tiger added with generics, but let's not abuse it so that it appears like a new language. To me, it is a sign that coders are missing the point about the profession. Build applications that people need. Create JSR extensions that add value to make the Java language richer so more people use it. The best ideas in life are simple. Giving coders the ability to abuse the language will cause a maintenance nightmare when the existing coders leave their positions and move on.

Speaking from my professional experience in costing, requirements analysis, design, coding, testing, deployment and maintenance, it bothers me when I see Java code that overuses Reflection. Reflection is bad. It should be limited to code that is used in IDEs. Instanceof checks are a sign that code is not correctly modelled. Creating a new instance of an object using Reflection makes things difficult to follow in a debugger, the last time I checked. Another bad source of code is the overuse of XML. Coders need to write the serialization code and perform all sorts of validation. The help of JAXB makes things easier, but how much easier is it to change Java code in an IDE than to change XML in an IDE? I bet changing Java code is easier because code completion works in almost every Java IDE, while XML tag completion does not always exist. A schema must be around, which requires more effort. And another unfortunate experience I had while maintaining someone else's code was this: the abuse of HashMaps to model all your system data, which I see in the above code. This makes the application difficult to debug. Back then in JDK 1.4 casting was everywhere to get the data out. Why not just create a proper model using a POJO? It makes more sense and becomes easier to read for the maintainer. If you need to debug the application to see who is setting a particular property, set a breakpoint in the debugger or throw in a Thread.dumpStack() in a runtime version and voila, you know who is calling you and how many times you are called. Overuse of the HashMap to store all your data is a nightmare to maintain. It should only be used for quick lookups of similar types that can be enforced with generics. I see code above with instanceof's. This is asking for trouble. If a new type is added during maintenance, you will still have to change the code. Making code overly generic is bad. It should only be done when there is a high instance count of objects being used or the bulk of your system can be modelled in a couple hundred lines of code. Then it is worth it.

In my mind, overextending the language to account for some slickness is just putting rope out there and waiting for people to hang themselves.

Write code, keep it simple, focus on brainstorming for new ideas and turning those ideas into solutions or products to make the world a better place. If a technical person who is a non-coder can pick up some code and understand it, give the language +1. Adding extreme complexity to the language will cause entrepreneurs to change to another language, maybe C#?

Brian Oxley said...

Rather than check for a type parameter in the constructor, you can make it a syntax error to leave out:

public abstract class TypeReference<R> implements Comparable<TypeReference<R>> {
// ...

public int compareTo(TypeReference<R> o) {
// Need a real implementation.
// Only saying "return 0" for illustration.
return 0;
}
}

Now this is legal:

new TypeReference<Object>() { };

But this is a syntax error for not implementing Comparable:

new TypeReference() { };

Anonymous said...

I've been looking at and using this for a while but I still can't convince myself of why this works. Particularly why does extension provide the magic?

Why does getGenericSuperclass() have the actual parameterized type, why wasn't it erased? Doesn't it have to be in the bytecode? If getGenericSuperclass() has the parameterized type, why doesn't the JDK supply a getGenericClass() method?

Thanks in advance,


Confused

Neal Gafter said...

The STATIC types are recorded in the bytecodes, but not the dynamic types. This works whenever the two are the same. The point of this post is to illustrate how they can differ.
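A sketch of that distinction, with made-up names: a field declaration and an "extends" clause are static types, so their type arguments can be read back by reflection, whereas the type argument supplied in an ordinary "new" expression is not recorded anywhere in the object.

```java
import java.lang.reflect.Type;
import java.util.ArrayList;
import java.util.List;

public class StaticVersusDynamic {

    // A field declaration is a static type; it is recorded in the class file.
    static List<String> field = new ArrayList<String>();

    // Reads the declared type of the field above.
    static Type declaredFieldType() throws NoSuchFieldException {
        return StaticVersusDynamic.class.getDeclaredField("field").getGenericType();
    }

    // Plain instantiation: the class is just ArrayList, whose declared
    // superclass mentions only the type variable E, not String.
    static Type plainSuperclass() {
        return new ArrayList<String>().getClass().getGenericSuperclass();
    }

    // Anonymous subclass: its "extends ArrayList<String>" clause is a static
    // type of the new class, so String can be recovered.
    static Type anonymousSuperclass() {
        return new ArrayList<String>() {}.getClass().getGenericSuperclass();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(declaredFieldType());
        System.out.println(plainSuperclass());
        System.out.println(anonymousSuperclass());
    }
}
```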

Albert Strasheim said...

I'm still trying to figure out this TypeReference (now called TypeLiteral in Google Guice) business.

Say I have something like:

class Foo<T>{}

is there a way for me to get at the equivalent of T.class inside a method of Foo using this gadget?

Thanks.

Anonymous said...

There's a little catch with these type tokens, namely when the type argument contains type variables. Consider:

    public static <T> TypeToken<List<T>> makeListTypeToken()
    {
        return new TypeToken<List<T>>() { };
    }

    TypeToken<List<String>> ls = makeListTypeToken();
    TypeToken<List<Number>> ln = makeListTypeToken();
    TypeToken<List<Number>> ln2 = makeListTypeToken();
    System.out.println(ls.equals(ln));
    System.out.println(ln.equals(ln2));

All TypeToken instances generated by makeListTypeToken() actually represent the type expression "List<T>" with the type variable T. Hence the above either prints "true" twice, or prints "false" twice (the Guice implementation mostly does the latter, except in certain inner class cases where it does the former), or throws an exception.

The underlying issue is that a type expression containing type variables doesn't actually denote a type (but, well, a type expression), similar to how a lambda-calculus term with free variables doesn't denote a value.

The TypeToken constructor therefore has to check its type argument and throw an exception if it contains a type variable. Of course this means that you get a run-time error for something that should be a compile-time error.
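A possible shape for such a check, as a sketch only: walk the captured type expression and reject it if a type variable appears anywhere. The TypeToken class here is a stand-in written for this example (it is not Guice's implementation), and the choice of IllegalArgumentException is an assumption.

```java
import java.lang.reflect.GenericArrayType;
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.lang.reflect.TypeVariable;
import java.lang.reflect.WildcardType;
import java.util.List;

public class TypeVariableCheck {

    // Returns true if the type expression mentions a type variable anywhere.
    static boolean containsTypeVariable(Type t) {
        if (t instanceof TypeVariable) {
            return true;
        }
        if (t instanceof ParameterizedType) {
            ParameterizedType p = (ParameterizedType) t;
            Type owner = p.getOwnerType();
            return anyContains(p.getActualTypeArguments())
                || (owner != null && containsTypeVariable(owner));
        }
        if (t instanceof GenericArrayType) {
            return containsTypeVariable(((GenericArrayType) t).getGenericComponentType());
        }
        if (t instanceof WildcardType) {
            WildcardType w = (WildcardType) t;
            return anyContains(w.getUpperBounds()) || anyContains(w.getLowerBounds());
        }
        return false; // a plain Class
    }

    private static boolean anyContains(Type[] types) {
        for (Type t : types) {
            if (containsTypeVariable(t)) {
                return true;
            }
        }
        return false;
    }

    // Hypothetical stand-in for the TypeToken discussed in the comment.
    abstract static class TypeToken<T> {
        final Type type;

        protected TypeToken() {
            Type superclass = getClass().getGenericSuperclass();
            if (!(superclass instanceof ParameterizedType)) {
                throw new IllegalArgumentException("Missing type parameter.");
            }
            this.type = ((ParameterizedType) superclass).getActualTypeArguments()[0];
            if (containsTypeVariable(this.type)) {
                throw new IllegalArgumentException("Type variable in " + this.type);
            }
        }
    }

    // The factory from the comment: every token it makes would denote List<T>.
    static <T> TypeToken<List<T>> makeListTypeToken() {
        return new TypeToken<List<T>>() {};
    }

    public static void main(String[] args) {
        new TypeToken<List<String>>() {}; // accepted: no type variables
        try {
            makeListTypeToken();
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

As the comment says, this only moves the failure to run time; the compiler still accepts makeListTypeToken().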

Anonymous said...

I've implemented both TypeReference and TypeToken. Both work fine. But I found them inconvenient:

TypeReference can be applied only when your class does not extend anything, so there's room for extending from TypeReference. This requirement drastically reduces the applicability of TypeReference.

TypeToken (I prefer to call it SuperTypeToken) can be applied when a class extends another class which has generic type parameters. It obliges me to create [possibly] an abstract class, declare type parameters in it and then extend it, creating the class I was really interested in from the beginning. This approach obliges me to have additional classes in my object model which I simply shouldn't have conceptually.

Why does Class have getGenericSuperclass() but not something similar like getGenericClass() to obtain a Type reference to the class itself?

Is there a way to circumvent this problem? I was not able to obtain the actual classes passed as generic parameters. :(

Thanks

Richard Gomes said...

I've written an article which contains a proof of concept, support classes and a test class which demonstrates how everything works together.

http://www.jquantlib.org/index.php/Using_TypeTokens_to_retrieve_generic_parameters

Thanks

Richard Gomes

Anonymous said...

Is there a reason that the class needs to be abstract? Would the same still work with a concrete class that is extended?

- Saish

Neal Gafter said...

@Saish: Yes, it would still work if the class were not abstract, but that invites the programmer to err by forgetting to extend it.

Derek P. Moore said...

Based on a description of this technique being used in jquantlib, I was able to solve for the non-anonymous case.

Assuming we're somewhere inside "class Test extends Tester" (given a "class Tester"):

Type superclass = this.getClass().getGenericSuperclass();
Type firstT = ((ParameterizedType) superclass).getActualTypeArguments()[0];
String nameT = ((Class) firstT).getName();
Class typeT = (Class) Class.forName(nameT);
T instance = typeT.newInstance();

Porting this solution to work within a single "class Test" should be trivial, and is left as an exercise for the reader.
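For reference, a compilable sketch along the lines of the snippet above, under the assumption that the subclass extends the generic class directly and supplies a concrete, non-generic class with a public no-arg constructor. The names Tester, BuilderTester and create are hypothetical.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;

public class NonAnonymousDemo {

    abstract static class Tester<T> {
        // Creates a T by reading the type argument named in the subclass's
        // "extends" clause. Fails if that argument is not a plain class.
        T create() throws Exception {
            Type superclass = getClass().getGenericSuperclass();
            Type firstT = ((ParameterizedType) superclass).getActualTypeArguments()[0];
            @SuppressWarnings("unchecked")
            Class<T> typeT = (Class<T>) firstT;
            return typeT.getDeclaredConstructor().newInstance();
        }
    }

    // A named (non-anonymous) subclass that fixes T to StringBuilder.
    static class BuilderTester extends Tester<StringBuilder> {}

    public static void main(String[] args) throws Exception {
        StringBuilder sb = new BuilderTester().create();
        System.out.println(sb.append("created").toString());
    }
}
```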

wheleph said...

I don't get why the trick with implementing Comparable forces programmer to provide the type parameter. If he/she doesn't provide it then the compiler just deals with raw TypeRef and provides compareTo(Object o) for it. Doesn't it?

I tried this example in JDK 6/7/8, and in all cases the code new TypeRef() {} compiles without errors. Am I missing something?

Idea factory worker said...

@Volodymyr
Not sure, maybe not providing a type parameter defaults to "Object"?
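A sketch that reproduces the observation, assuming the Comparable-based declaration from the earlier comment (named TypeRef here, as in the question). Reflection suggests the omitted argument is not replaced by Object: the superclass is recorded as the raw class, which is exactly the case the constructor check in the post tests for with "superclass instanceof Class".

```java
import java.lang.reflect.Type;

public class RawSubclassDemo {

    // The Comparable-based declaration from the earlier comment.
    abstract static class TypeRef<R> implements Comparable<TypeRef<R>> {
        public int compareTo(TypeRef<R> o) {
            return 0; // placeholder, as in the original comment
        }
    }

    // Subclasses the raw type; this compiles, with at most a raw-type warning.
    @SuppressWarnings("rawtypes")
    static Type rawSuperclass() {
        return new TypeRef() {}.getClass().getGenericSuperclass();
    }

    // Subclasses with an explicit type argument, for comparison.
    static Type parameterizedSuperclass() {
        return new TypeRef<Object>() {}.getClass().getGenericSuperclass();
    }

    public static void main(String[] args) {
        System.out.println(rawSuperclass() instanceof Class);
        System.out.println(parameterizedSuperclass() instanceof Class);
    }
}
```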