GTUG – Using the Google Collections Library for Java (1 of 2)

(My Original Blog Post: http://api-madness.com/post/gtug-using-the-google-collections-library-for-java-1-of-2/)

Google Collections Library for Java

Attention! API Garbage (might be unreadable):

set immutable collection

Josh Bloch introduced Kevin. Josh's introduction:

So I have the great pleasure of introducing Kevin Bourrillion tonight, who is going to be talking to you about Google Collections. As many of you know, I developed the Java Collections Framework 11 years ago. It seems almost unreal, but it really is true: it was in 1997 that I developed it. The framework consists of the core collection interfaces, the implementations of those interfaces (either concrete or abstract), and the algorithms that allow you to manipulate collections. The basic idea around this framework was that it was extensible: people could build on it, people could add to it, and it would turn into a sort of ecology over the years. That has happened, but it has really blossomed with Kevin's framework, and I'm delighted that he will tell you all about that tonight. Not only does it build on the earlier work, but it does so in a fashion that I'm very comfortable with. It follows ideas that have been pervasive in my own work. One is to minimize mutability: I have always preferred immutable things to mutable things, and when you must have mutability, I prefer to allow as little change as necessary to achieve the stated goals. Another one is "when in doubt, leave it out," which is one of the big ones. A third one is to know and use the libraries: I'm a great believer in having a small number of people work very hard to produce systems that solve your problems for you, so that you can write your programs without worrying about these lower-level problems. I believe that Google Collections does all of these things. So please give a warm welcome to Kevin Bourrillion, who will tell us more about Google Collections tonight.

Kevin's talk:

Tonight I'll talk about Google Collections, but first I have to correct one thing from Josh's introduction. He called it
Kevin's framework, and I have to point out that the framework is the result of a lot of hard work by a lot of people. In particular Jared Levy, who is sitting back here, has done a great deal for the library. It is very much a project driven by Jared and myself, with a lot of help from a lot of our friends. So, on with the real show.

First, we're talking about an open-source, Apache-licensed library that you can go download tonight or, if you have your laptop, right now as we speak, at google-collections.googlecode.com. It currently works with JDK 1.5. In the future we will want to add support for Java 1.6, but when we do, we are going to maintain the 1.5 branch of the library for some time as well. We will probably be taking steps to make that 1.5 branch work with Google Web Toolkit, so that those of you using Google Web Toolkit will be able to use that version of the library even if the full 1.6 version does not completely work with it. Right now we have been making prerelease snapshots available. We're about to make a release later this month, which we're going to call 0.9.
We're calling it 0.9 because we do not want to freeze our APIs (all of the signatures, all of the names, the organization of files and folders) until there has been ample time for people, not just inside Google but also outside Google, to really use it and give us their feedback. So if you would rather not use code where a future upgrade may force you to rename some of your calls, then it may be a little early for you to use Google Collections, because some of the names may change. Once we've got to 1.0, the API will be frozen, and we will commit to it from that point on. But apart from the issue of API stability, it is a very production-worthy library. It is used very widely in production at Google, in many dozens of Java-based projects, including the Google presentations product that I'm using to show you these slides.

What I will do in this presentation is give you a somewhat uneven overview of the library. On some topics I'll go into more detail, and other ones on this list I'll skip right over in the interest of time. If some of the topics I skip over are interesting to you, you can bring those up later. The other thing I want to point out is that in this presentation I do assume that you have a pretty good familiarity with the Java Collections Framework in java.util, and I won't spend a lot of time explaining things that are in there. The reason is not that if you don't know those things I think you're a jerk. You might be a jerk, I don't know. The reason is that if you are not very deeply familiar with java.util, then you should not download our library and should not be using it. You should spend your time first digging into the detail of what's already in Java, until you're really familiar with those things. That library is the bedrock, and we're only the icing on the bedrock. Mixed metaphors rule.

So I'm going to talk about immutable collections and a couple of our new collection types, which are
multisets and multimaps. Then I'll skim over a few other types and implementations, and skim over some of the static utilities that we offer. As the talk goes further and further, there will be more and more things that I'm not going to cover. Questions are welcome throughout, especially if you have a question about something that is getting in the way of your understanding what I'm trying to present; I would really rather you ask that question right away. If I think a question is best left for later, I'll say so.

So, immutable collections. In the JDK we have lots and lots of mutable collections, and they have unmodifiable wrappers, the Collections.unmodifiable methods. They exist for Set, List, Map, SortedMap, SortedSet and so on. However, there is a difference between unmodifiable and immutable. Immutable is a stronger form of the concept. If I give you an unmodifiable collection, I'm really talking about that reference to the collection: I'm saying you cannot change this collection via this reference, but I'm not promising that some other actor might not be changing the collection behind the scenes. If something is immutable, it means it absolutely, positively will never change. It is possible to use these unmodifiable wrappers to make something immutable. All you have to do is make sure you throw away the direct reference to the mutable instance, so that the only reference that survives is the unmodifiable one; then it is basically immutable. But we've gone much further than that.

I can't even begin to list all of the benefits of immutability. They are covered in Effective Java, in both editions. Basically, when your objects are immutable, your life is easy; when your objects are mutable, your life is hard. Sometimes you need that mutability, and so you make your life hard, but if you don't have to make your life hard, why would you want to make it harder? So when you're
immutable you get thread safety for free, and you are free not to worry about crossing trust boundaries, among other concerns. So in our library we have several immutable collection implementations that are brand-new, standalone implementations. They do not just delegate to some other JDK class, which delegates to some other JDK class; this is tightly optimized code. There are immutable List, Set, SortedSet and Map implementations. The one you might expect that is missing is ImmutableSortedMap: it doesn't exist yet, but it will, in recognition of the fact that these things should be consistent. As for sorted versus navigable, that is a Java 5 versus Java 6 issue, and right now our library is focused on Java 5.

So the JDK wrappers are still good for what they have always been good for: they're really good for providing unmodifiable views of data that can still be changing. But if that is not your situation, then ours are better in most ways. For one thing, they give you that hard immutability guarantee. As well as being conceptually immutable, they are also thread-safe and immutable according to the Java nitty-gritty. I share an office with Jeremy Manson, so I asked him what I would have to do to make sure this is bona fide immutable in Java, and that's all taken care of. That guarantee is very valuable. They are also much easier to use than the wrappers, as I'll show you, and they happen to be slightly faster and typically use less memory, sometimes dramatically less. For ImmutableSet the difference in memory usage can sometimes be on the order of two to three times.

So here is an example of how people typically create a constant set. This is one of the use cases for an immutable collection: some constant that you're going to use to initialize a static field. First you have to declare public static final Set of Integer and the name of your constant, and now you have to fill it up. You usually have two choices: you can call some static method, or you can have a static initializer block, which is possibly the
more popular approach. You create a collection, you add a bunch of stuff to it, and then you have to remember to wrap it in unmodifiableSet when you're done. It's surprising how many times we forget that final step of wrapping it in Collections.unmodifiableSet, and we actually store a mutable collection in a public static final field.

It turns out you can actually do better than this using just the JDK. I said much better; it's a little bit better. Reading it backwards, here is what's going on. We are using the method Arrays.asList, which is a varargs method, to put in a bunch of numbers. An array is allocated to hold them, and that array is passed to asList, which provides a collection view of it. We pass that to the constructor of LinkedHashSet, which copies it, and then we call unmodifiableSet on the result of all that. The one good thing is that you can do it all in one statement; you don't have to have a static initializer block and all that stuff. But it refers to four different classes, and there is something kind of strange about that when we really only want to do one simple thing.

So here is what it looks like with ImmutableSet: ImmutableSet.of, followed by the elements you want it to have. It couldn't be much simpler than that. Now when people read your code, it says exactly what you want to say. It's very direct, and you get all of the performance benefits at the same time, while only having to reference one class. By the way, this naming pattern of ImmutableSet.of was inspired by the brilliant java.util.EnumSet, which is the first thing I'm aware of that used this nifty little naming pattern. In a moment there is another example, with maps, where the difference is even more pronounced.

There was a question: why is the new one stored in a variable of type ImmutableSet instead of Set? I think the answer to that will become clear in future slides, but the
simple answer is that you're conveying that guarantee of behavior. Most of the time we're accustomed to creating implementation types like LinkedHashSet and storing them in fields of interface types like Set. The reason for that is that the interface type conveys all the behavior that we really mean to convey. In this case ImmutableSet is both an implementation and a behavioral guarantee, so by declaring it right up front in the type of our constant, we are making that guarantee very clear: you are guaranteed that it can't be changed.

The Map example is even uglier than the Set example: you put all the stuff in there and wrap it, the same basic thing. This time, with our ImmutableMap, there is a sort of builder-like syntax, where you simply say ImmutableMap.builder(), then a whole bunch of lines giving all the entries that you want this map to be created with, and then you call .build(). So this one statement is creating a builder, filling up the builder, and then creating a new map when it is done with the builder. It couldn't really get much simpler than that.

So those were some constants. Another thing that you often want to do is make defensive copies of a collection that somebody has given to you. If you're going to make a defensive copy, it's a good idea to make that defensive copy into an immutable collection, because if you're not planning to modify it, why bother putting it in a mutable collection? Here is an example using the existing JDK APIs. We're creating a LinkedHashSet so that we can preserve the ordering of those numbers, and we're throwing all the numbers into it and wrapping it. This is not too bad, but we have to remember to do that copy on the way in, and we have to remember to wrap it in unmodifiableSet, either on the way in or on the way out; we just want to make sure that happens before the reference gets passed back from getLuckyNumbers. Now, here is what this looks like using an immutable
set. The key line is line 3 there, where luckyNumbers = ImmutableSet.copyOf(numbers). On the previous slide that line looked like all that other stuff, so now it is much simpler. A few other things have happened here as well, which I have conveniently boldfaced. ImmutableSet is being used as the type of the field and as the type of the return value of this getLuckyNumbers method. This is a much more valuable idea than I thought at first. What happens in real life, when you write a method like getLuckyNumbers, is that you don't have the advantage of it just being listed on a slide; you have to actually maintain the software. So you have to write Javadoc for that method that conveys what the contractual behavior of the method is. With a method like this one, you find yourself writing things like: "Returns a set containing the lucky numbers. It's an immutable set, really. I promise I won't ever change what's in the set. Please believe me. I'm pretty sure that I've done this right and that it will never change." The caller may or may not trust you, and the caller may feel compelled to make a defensive copy of it. Here we get around all that messy stuff, because you're just plain returning ImmutableSet. There's just no way that thing can ever change, and the caller knows it, so the caller knows that he doesn't have to make a defensive copy as well.

Doing that then sort of forces our field at the top to also be of type ImmutableSet, so that we can return it from the method. That means that we cannot accidentally forget to make our defensive copy when the Set of Integer is passed in. Actually, if we forget to make the defensive copy, it won't compile, because we'll be trying to assign a Set to a variable of type ImmutableSet. So we remember: oh, I need ImmutableSet.copyOf. It is always nice when you arrange things so that you are not dependent on static analysis tools or code reviews to point out these bugs; you do it
simply by using the APIs in the normal patterns, and the mistakes simply fail to compile. Yes, I assure you, the compiler is the best static analysis tool there is. Exploiting the type system to catch the things that are wrong with your programs is great. I should mention that we are recording tonight, so this is all going to go on YouTube at a later date. If you have questions, just wait where you are and we will run the microphone over to you.

A few other things. If you start using ImmutableSet, and every time you pass something from one API to another it is constantly calling this copyOf method over and over, you may wonder whether your server will be bogged down under the weight of all these copyOf operations. Well, no, it won't, because if you look at the specification of the copyOf method, it gives itself permission to cheat. It basically says that if it can determine that the set passed in is already of type ImmutableSet, which is not very hard, it may return that same instance, or something functionally identical to it, so you won't really care. That means you don't have to stop and worry, "Should I really make a defensive copy here?" If the copy turns out to be unnecessary, the cost will likely be negligible.

Yes, there's a question at the mic. So, there are no constructors for any of these that you can call with new: none of them have public constructors. If they did, you would be able to subclass them. We actually could not make these classes themselves final, because internally, in their guts, they need to have their own subclasses to do some things. So if the constructor were public, you would be able to subclass them, and that would wreck the whole immutability guarantee. There are many different factory methods to create them with, so hopefully you shouldn't miss the constructors.

There was also a question about sorting. If you are thinking
about things like SortedSet, yes, we do also have an ImmutableSortedSet class. If you pass a bunch of stuff to ImmutableSortedSet.of, it sorts those things at that time and stores them in the set in sorted order.

Some more examples. The of factory methods can be used to create empty sets and singleton sets as well as n-element sets. Now, why wouldn't you just use Collections.emptySet and Collections.singleton, the things we've been using for the last 11 years? The answer is that you can, but consider what my example shows. We had an immutable set of countries called "beta countries" in one of our systems, and there was always a country or more where we were in the process of experimentally rolling out new forms of payment and so forth. Over time, as we edited our code, this set of countries would shrink from five down to two, then down to zero, and then back up to one and back up to four. If you use ImmutableSet for some things and Collections.emptySet and Collections.singleton for other things, you are constantly churning back and forth. We find it a lot easier to have the discipline to always use ImmutableSet.of, so that it always stays the same.

There are also copyOf methods that take iterators and iterables, so if you have your data in some other representation, you can throw it in there. In the future I think we're pretty much in favor of having builders as well, so that you can say: add everything in this collection, then add a single element, and remove a single element, then .build(), and you get your ImmutableSet out of that. At the bottom is an example of creating a small map. It simply looks like ImmutableMap.of.
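The unmodifiable-versus-immutable distinction and the defensive-copy pattern from the getLuckyNumbers discussion can be sketched with nothing but the JDK. This is only an illustration under stated assumptions: the names DefensiveCopyDemo, viewOf and snapshotOf are made up for this example and are not part of Google Collections. With the library, the ImmutableSet.copyOf call described in the talk would take the place of the copy-then-wrap step shown here.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical demo class; names are illustrative, not from Google Collections.
class DefensiveCopyDemo {

    /** Wraps the caller's set without copying: this is only an unmodifiable view. */
    static Set<Integer> viewOf(Set<Integer> source) {
        return Collections.unmodifiableSet(source);
    }

    /**
     * Copies first, then wraps. No reference to the mutable copy escapes,
     * so the result behaves as an immutable snapshot.
     */
    static Set<Integer> snapshotOf(Set<Integer> source) {
        return Collections.unmodifiableSet(new LinkedHashSet<>(source));
    }

    public static void main(String[] args) {
        Set<Integer> numbers = new LinkedHashSet<>(Arrays.asList(4, 8, 15));
        Set<Integer> view = viewOf(numbers);
        Set<Integer> snapshot = snapshotOf(numbers);

        numbers.add(16); // some other actor changes the original behind the scenes

        System.out.println(view.contains(16));     // true: the view sees the change
        System.out.println(snapshot.contains(16)); // false: the copy does not
    }
}
```

Both results reject add and remove through the returned reference; only the snapshot is also insulated from changes made through the original set, which is the stronger guarantee the talk attributes to the immutable collections.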
That looks very much like the ImmutableSet.of example, but the parameters are alternating keys and values. It's very important to realize that this is not a varargs method that's just going to allow you to pass a whole bunch of objects and then hope that things come out right. If we did that, you would lose type safety: the compiler would fail at its static analysis, your IDE would fail at its auto-completion, and all kinds of sad things would happen. So we don't have a varargs method for this. We have overloads that go up to five key-value pairs, just like EnumSet does, and if you want to make a map bigger than that, you just use the builder solution shown earlier. So either way you're pretty well covered.

Are there any questions floating out there? There haven't been too many; I hope that's a good sign. The question is about sorted sets and where you provide the comparator. Well, I had a lot of examples of ImmutableSet, and I forgot to put in an example of ImmutableSortedSet. Basically, if you want natural order, it is ImmutableSortedSet.of and the list of elements. If you want a custom comparator, it looks like ImmutableSortedSet.orderedBy(myComparator), followed by the elements you add and then .build().

OK, so, caveats. These collections do not like null elements. They are hostile to the very idea of null; they will reject any null that you try to put in. Why do they do this? We did a lot of extensive research, and we found that the number of times that you want a null element in one of these collections is very low. I'd say that 95% plus of the time you don't. There are religious arguments to be had on whether that really should be 100% or not, and I don't even want to go there, but it's rare to want null. So what happens is that if you make a collection tolerant of null, then you're serving those 5% of cases at the expense of the other 95%. In the 95% case you would much rather it blew up immediately if you tried to put a null
element in, so that you catch that error right at its source and don't only find it later in some totally separate piece of code. Luckily, we are pretty free to make a decision like this, because if you do need null elements, you have a perfectly reasonable workaround: simply keep doing what you were doing before, creating a HashSet and passing it through unmodifiableSet. Since that's not really that bad of a fate, it frees us to focus on the dominant use case and make sure that these do exactly what you want in that case. By the way, the JDK collections have been taking the same approach ever since 1.4 or so, in the ones that Doug Lea has added. Doug Lea has done a lot of work on collections, and his intellect sort of exceeds any that I am aware of. His quote on the subject is "Null sucks." I don't remember which academic publication that was in. Disallowing null also happens to make the implementations a little faster and keeps the implementation simpler, but that's not really what you care about most of the time.

Another caveat is that you can confuse yourself and others if you put mutable elements into an immutable collection, pass it around, and advertise it as an immutable collection (I keep tripping over the word), because people will assume that the immutable collection is deeply immutable, when really it has elements that can be changing all over the place. If you put deeply immutable objects in, you get a deeply immutable object as a result. If you don't, just proceed with caution.

So the way I summarize this whole section, which is the biggest section of the talk, is this. In the past we always would ask, "Does my collection need to be immutable?" If it didn't, we would just use a plain HashSet or whatever, and if it needed to be immutable, we would wrap it. With these new collections that are part of Google Collections, I flip the argument 180 degrees, and now the question is, "Does it need to be
mutable?" If it needs to be mutable, go ahead and use an ArrayList or whatever. But if it doesn't, you're better off using these, for all the reasons that we just covered. Any questions before I move on to multisets?

So, Multiset and Multimap are two of the new types of collections that we've created, and they're the ones I chose to spend the most time and energy discussing today. Then in section 5 I'm going to give some highlights of some of the other types. To set up a conversation about Multiset, which is a type of collection, I'm going to talk a little about how to select the right collection for the right job. Obviously, any time you have a whole bunch of foos of a similar type and you want to store them in something, you're looking for a collection of some kind. But in order to decide what kind of collection to use, you have to ask yourself a few questions. The first two of these are really the big questions that determine what interface type you're going to be using; the third one has more to do with what implementation of the interface you're going to choose.

The first question is: can this collection contain duplicate elements? The second question is: is the ordering of my collection significant? By significant, what I really mean is this: if collection one and collection two contain the same elements in a different order, do you consider them to be the same? Do you want .equals to return true in that case? Would you want them to map to the same value if they were used as a key in a map? The third question asks: what is the iteration order? The reason for this ordering of the questions is to emphasize the fact that when we talk about the ordering of elements in a collection, there really are two very different things we could be talking about. We could be talking about whether order is significant for equality (the second bullet), or we could simply be talking about what order things come out in when you iterate over the elements of the collection. There are lots of different collection types that use lots of different
approaches to that: insertion order, like LinkedHashSet; comparator order, like TreeSet; user-ordered, by which I mean something like an ArrayList, where the user can finely control exactly what order he or she wants those elements to be in; it could be something else; or it could be that the order just doesn't matter and you don't care.

I'm going to focus on the first two questions, because the JDK has provided two collection types that cover two of the possible combinations. Here's my awesome ASCII-art table. I tried to use the new table-drawing feature of the application tonight, but I was in too much of a rush, so I went with good old-fashioned ASCII. There are two possible answers to the question of whether the order is significant, and two answers as to whether the collection can contain duplicates. If it has duplicates and has a significant order, that's a List. If it doesn't have duplicates and doesn't have a significant order, then it's a Set. So why did the JDK developers fill only two of the boxes in the four-box windowpane? Because List and Set just happen to cover 90 to 95% of all the things that you want to do. I'm not denigrating them at all; they really are the workhorses of the collection types. But when you do a lot of Java development, like a lot of you do, you occasionally realize that you need something to fill one of those other slots. So we provide Multiset, an idea which I don't claim we invented. Some people probably know multisets as what are also called bags. It is something that can have duplicates but whose ordering is insignificant. You can also fill the fourth box in this diagram with something that I've called UniqueList. This doesn't exist in our library yet. It exists in an experimental form internally that we're playing with. It might eventually make its way out, but it's certainly the
least useful of these four.

So why would you want to use a Multiset? Some examples. "I kind of wanted a Set, except oops, it can have duplicates" is a silly statement, but here's an example. Suppose that you are modeling card games; maybe you're working on an artificial intelligence to play a card game like poker, or any favorite card game. You may have the collection of cards that makes up the user's hand, or that makes up the deck, expressed as a Set. Everything is fine until one day you start work on a game like Uno, or pinochle, or blackjack with a six-deck shoe, because suddenly those cards aren't unique anymore and you can't use a Set. What you usually end up doing is switching to a List so that you can get those duplicates. But then you sacrifice a few things. For one thing, you're sacrificing the performance of contains, because most List implementations have to do a linear search for contains. Another thing is that you've lost the ability to tell whether this hand of cards and that hand of cards represent the same exact hand. You may have some sort of table where you're computing that this hand of cards has this probability of success, and if you can't tell when two hands are the same except for order, you end up doing work twice and considering them different when they're really the same.

At some point everyone eventually hits the spot where they say, "I need to compare two lists, but the order doesn't matter to me; I just want to compare them," and ends up writing a utility method for this. There are a few things you can do. You can sort both lists first and then compare them with the normal equality method. That only works if the elements in the lists are comparable, or if you have some comparator to use on them; if not, you're
just out of luck. Or you could do something where you create a temporary collection, put everything from one list in it, and remove the elements as you go through the other. But you really shouldn't have to jump through all those hoops. If you're jumping through all these hoops, it's really a sign that you're using the wrong type of collection. We really should be using Multiset. If you're using a Multiset to represent this hand of cards, you simply call .equals on the multiset, passing in the other multiset, and you get the right answer.

The most popular use for Multiset is for histogram-type structures, and I should clarify: integer-valued histogram-type structures. An example: I want to know what tags I apply to the articles on my blog, and how often I'm using each of these tags. I iterate over all the articles on my blog and put all the tags in, and then I find out that, oh, Java is my number one tag and Britney Spears is number two, and I learn something about my blog. Multiset is good for this because the performance varies based on the number of distinct elements, not on the total size of the collection.

Before I show you the code for Multiset, I'm showing you how it would suck if you didn't have Multiset. This is code that many of you have probably written several times. You have a Map where the objects you're interested in are your keys, and the values are an Integer, or an AtomicInteger, or something else; there are all kinds of variations on this. We loop through all our blog posts, for each one we get its tags, and then we have to figure out how to count each tag. We have to check whether our map already contains an Integer value for it: if it does, we get it and increment it; if it doesn't, we have to start it off at zero, or start it off at one. After all of that work, we're able to ask a few questions of our map. If we want to know what all the distinct tags I was using are, we just take the key set of the map we populated. If we want to know how many times I used the Java tag, the instinctive
thing to do would be to call tags.get("java"). But be careful, because if you never added any tags called "java", this is going to return null, and if you try to assign that to an int, you get a NullPointerException. So you have to be careful: find out whether "java" is even in this map; if so, use the value, otherwise use zero. And then suppose you want the total count of every time I applied a tag to a blog post. Oops: now you've got to either iterate over the whole map and sum the values up, or maybe you should have been summing them up from the beginning. It's not a big deal, but it's always a pain.

As part of the research I did while writing Multiset, I looked over the corpus of all the Java code at Google, and there were no two source files that did this the exact same way. There are always ever so many slight variations on doing this, and a third of them have bugs. Josh points out that the Java Collections tutorial shows how to do this, but that in no way diminishes the value of this work. Josh is correct: the Java tutorial contains this and many other nuggets of wisdom, and you should be going there first before using our stuff.

So here is what it looks like afterwards. Instead of a Map, I have a Multiset. For each post I get its tags and add them all to the multiset, and I'm done. I've got all this empty space there; I was going to take a picture of my daughters and stick it in there just to have something. Before, after. Before, after. And now, when I want the distinct tags, we use a method on the multiset called elementSet, which gives you a Set containing all the distinct elements that are found in your multiset. If you want the count for the "java" tag, it's tags.count("java"). Arguably it might even be called getCount, but I don't think so. With count, it will always return zero or more; it will never return a negative number or null. And the total count is just the total size of your multiset, because every occurrence you add counts towards the size; the total size is just what we're looking
for.

But the difference doesn't stop there, and I don't want to oversimplify. If you look at the size of this code and the size of that code, and you decide based on this that Multiset is better, well, that's true, but you're also missing the bigger picture. The bigger picture is that your software is going to change and continue to evolve, and when you're using the right abstraction and a powerful library, then as your needs evolve, the library comes along with you and does what you need. Suppose that now you need to remove tags from this collection, a.k.a. decrement the values for these tags. Now you have to write more code. Each time you need to remove something, you need to check that you're not accidentally going to send the count negative, as that would be nonsensical. And if you get down to zero, you have to figure out what to do: should you remove that entry from the map? You have to deal with that, and if you don't, you could have a memory leak, with the map growing without bound. With a Multiset, you just call remove.

What if you want concurrent access? We have a lot of servers that are continually gathering statistics on all kinds of things. Suppose you're an HTTP server, and every time you send a status code like 200 or 404, you want to keep a running tally of how many of these you've sent. If you're using the map approach, you pretty much have to lock the entire map while you're doing all that stuff. If you're using Multiset, you just change the implementation from HashMultiset to ConcurrentMultiset, and now you've got multiple threads simultaneously incrementing values and decrementing values safely without stepping
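The map-based bookkeeping described above (start a count at one, return zero rather than null for a missing tag, never go negative, drop the entry at zero, track the total) can be sketched in plain JDK code. This is a hedged illustration, not the library's Multiset: TagCounter and its method names are hypothetical, chosen only to mirror the operations the talk names (count, elementSet, size, remove), and it makes no attempt at the concurrent case.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical JDK-only stand-in for the counting behaviour described in the talk.
class TagCounter {
    private final Map<String, Integer> counts = new HashMap<>();
    private int total = 0; // summed from the beginning, so size() needs no iteration

    /** Adds one occurrence, starting the count at one for a new tag. */
    void add(String tag) {
        Integer old = counts.get(tag);
        counts.put(tag, old == null ? 1 : old + 1);
        total++;
    }

    /** Returns 0 for an unseen tag instead of null, so unboxing cannot fail. */
    int count(String tag) {
        Integer c = counts.get(tag);
        return c == null ? 0 : c;
    }

    /**
     * Removes one occurrence and reports whether anything was removed.
     * The entry is dropped at zero so the map cannot grow without bound,
     * and a count can never become negative.
     */
    boolean remove(String tag) {
        Integer c = counts.get(tag);
        if (c == null) {
            return false;
        }
        if (c == 1) {
            counts.remove(tag);
        } else {
            counts.put(tag, c - 1);
        }
        total--;
        return true;
    }

    /** Read-only view of the distinct tags seen so far. */
    Set<String> distinctTags() {
        return Collections.unmodifiableSet(counts.keySet());
    }

    /** Total number of occurrences, duplicates included. */
    int size() {
        return total;
    }

    public static void main(String[] args) {
        TagCounter tags = new TagCounter();
        for (String t : new String[] {"java", "collections", "java"}) {
            tags.add(t);
        }
        System.out.println(tags.count("java"));         // 2
        System.out.println(tags.count("python"));       // 0
        System.out.println(tags.size());                // 3
        tags.remove("collections");
        System.out.println(tags.distinctTags().size()); // 1
    }
}
```

Every rule in this class is one that each hand-written map version has to get right on its own, which is the kind of slight variation and bug the talk reports finding across source files.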

~ by bitahatini on January 9, 2009.
