
Stringing Along


Remember that in the last lesson we learned about the eight primitive types. Although they are useful, they are also quite limited - we would like to be able to use more complex types. In fact, Java does have many other types of things, called objects. Anything which is not one of the primitive types is an object. In the next few lessons we will learn about using objects, and eventually you will be able to create your own!

To introduce objects, we're going to look at one which is built into the Java language - the string. In fact you have already used it several times, right from your first program. Now objects are much like primitive types in many respects - for example you can store an object in a variable:


class Hello {
	public static void main(String[] args) {
		int age;
		String name;
	}
}

You can see we've created two variables, one called age which can store an int, and another called name which can store a String. That much is easy, right? The first thing to notice is that the primitive types all start with a lowercase letter, but String has a big S. In fact, all object types should start with a capital letter. It's a convention rather than a rule that Java enforces, but it helps to make your programs less confusing. Also, object types should use camel notation which we learned about much earlier. We can tell an object type from a variable at a glance because variables should start with a lowercase letter. Here's an example (don't change your program):


int distanceInMiles;
String nameOfCity;
RouteMap route;

I hope this makes my point about names clear. I've used an imaginary object type called RouteMap. Forget that, and change your program slightly:


class Hello {
	public static void main(String[] args) {
		int age = 19;
		String name = "Ben";
	}
}

Now we've used an integer literal (19) and a string literal ("Ben"). Again very easy stuff. So let's add something to each variable, and print them out.


class Hello {
	public static void main(String[] args) {
		int age = 19;
		String name = "Ben";

		age += 2;
		name += "Golding";

		System.out.println(name);
		System.out.println(age);
	}
}

I've taken the chance to revise some stuff from earlier - how to add something to a variable using the += operator. As you can see it works for strings too.

[nurmes]btg: javac Hello.java
[nurmes]btg: java Hello
BenGolding
21
[nurmes]btg: 

And finally, you can copy an object to another variable:


class Hello {
	public static void main(String[] args) {
		int age = 19;
		String name = "Ben";

		int i = age;
		String s = name;

		age += 2;
		name += "Golding";

		System.out.println(s);
		System.out.println(i);

		System.out.println(name);
		System.out.println(age);
	}
}

I hope by this stage you are feeling confident that objects aren't going to be too hard or scary.

[nurmes]btg: javac Hello.java
[nurmes]btg: java Hello
Ben
19
BenGolding
21
[nurmes]btg: 

Make sure you understand why the program does what it does. Now remove some lines until it looks like this:


class Hello {
	public static void main(String[] args) {
		String name = "Ben";

		System.out.println(name);
	}
}

Now a string is just an object representing a sequence of characters (letters and symbols). We (as humans) say that two strings are equal if they contain the same characters in the same order. However, in Java we can have two independent strings which contain the same characters. It's much like two people with the same name - they are still different people.


class Hello {
	public static void main(String[] args) {
		String name = "Ben";
		String name2 = "B";

		name2 += "en";

		System.out.println(name);
	}
}

We now have two different "people" with the same name. There is a reason why I've created the second one in a slightly strange way, which I'll explain later. Now let's use a built-in function of strings to ask if the two names are the same (I'll explain exactly how this works later).


class Hello {
	public static void main(String[] args) {
		String name = "Ben";
		String name2 = "B";

		name2 += "en";

		System.out.println(name.equals(name2));
	}
}

When we say name.equals(name2), we are asking if the two strings contain the same characters in the same order. This gives us a boolean value, so the program can only print true or false.

[nurmes]btg: javac Hello.java
[nurmes]btg: java Hello
true
[nurmes]btg: 

I suppose we should have expected that - we designed the two "people" to have the same name, after all. But there's another type of equals for strings (and all other objects):


class Hello {
	public static void main(String[] args) {
		String name = "Ben";
		String name2 = "B";

		name2 += "en";

		System.out.println(name.equals(name2));
		System.out.println(name == name2);
	}
}

When we use the == operator with objects, we are asking if they are the same object.

[nurmes]btg: javac Hello.java
[nurmes]btg: java Hello
true
false
[nurmes]btg: 

The verdict: the two people do have the same name (so the first line says true) but are not the same person (so the second line says false). Make sure you have this idea clear in your mind - we can have two objects which are equal, but are distinct objects. Now we're going to change the program slightly:


class Hello {
	public static void main(String[] args) {
		String name = "Ben";
		String name2 = "Ben";

		System.out.println(name.equals(name2));
		System.out.println(name == name2);
	}
}

You might think that this version of the program makes more sense. Just try running it:

[nurmes]btg: javac Hello.java
[nurmes]btg: java Hello
true
true
[nurmes]btg: 

It seems (from the second true) as if both name and name2 actually contain the same string! In fact this is exactly what has happened, but why? Well, when the compiler (javac) sees a string literal (like "Ben") which is repeated, it uses the same string for both. Not two strings containing the same characters (which would use twice as much space in the computer's memory), but the actual same string.

So supposing you write a program to compare two strings, should you use equals or the == operator? 99% of the time, you should use equals when comparing strings (or any other type of object in fact). You should only ever use == if you want to check that they are the very same object, which is not often. The reason I explained the confusing point with string literals above is that sometimes it can cause == to work (as in the program which prints true twice). Unfortunately sometimes it doesn't (as in the program which prints true and then false). If you don't understand any of this then just remember not to use == with object types (except in very special cases).
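To see both cases side by side, here is a small sketch (don't change your program for this one). The class name CompareDemo and the helper buildAtRunTime are made up for illustration, and the helper jumps ahead a little, since we haven't covered writing our own functions yet. It relies on one extra fact: when two string literals are joined with + directly in the source, the compiler does the joining itself, so the result counts as a literal too.

```java
class CompareDemo {
	// Builds "Ben" while the program is running, like name2 += "en" earlier.
	static String buildAtRunTime() {
		String s = "B";
		s += "en";
		return s;
	}

	public static void main(String[] args) {
		String literal = "Ben";
		String folded = "B" + "en";       // joined by the compiler, so it is the same literal
		String built = buildAtRunTime();  // joined while the program runs, so it is a new string

		System.out.println(literal.equals(folded)); // true
		System.out.println(literal == folded);      // true
		System.out.println(literal.equals(built));  // true
		System.out.println(literal == built);       // false
	}
}
```

Notice that equals gives the answer we wanted every time, while == depends on how the string happened to be made.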

This is probably a good point to stop and explain the basic difference between objects and primitive types. The primitive types each use a very small fixed amount of the computer's memory. Objects can vary in size from very small to very big (a string for instance could contain one character or an entire novel), so copying large objects around would be very inefficient. Instead, objects are handled using object references. Allow me to explain:


class Hello {
	public static void main(String[] args) {
		String name;
	}
}

We have created a variable called name of type String. In fact the variable only stores an object reference to an object of type String.


class Hello {
	public static void main(String[] args) {
		String name = "Ben";
	}
}

Two things have happened now. The first is that when we write "Ben", a string object is created in the computer's memory. This string is completely separate from the variable name. In this case the string is very short, so it takes up little more space than its three characters need, but it could be much more. The second thing is that the = operator copies an object reference into the variable name. This object reference tells us where to find the actual string object in the computer's memory (remember that it is separate from the string variable).


class Hello {
	public static void main(String[] args) {
		String name = "Ben";
		String name2 = name;
	}
}

Now we have copied the string from one variable to another. All this does is copy the object reference, so that the name2 variable also knows where in the computer's memory to find the string. Imagine the string did contain an entire novel: copying one object reference like this would be a lot quicker than copying millions of characters.

I hope you're keeping up with all this object reference stuff. It's just an efficient way to deal with potentially large objects, and mostly you don't need to worry about it. Here's an interesting question: both the variables above now contain a reference to the same string. So if we change the string, do they both change or what? Let's find out:


class Hello {
	public static void main(String[] args) {
		String name = "Ben";
		String name2 = name;

		System.out.println(name);
		System.out.println(name2);

		name += "Golding";

		System.out.println(name);
		System.out.println(name2);
	}
}

Run the program and...

[nurmes]btg: javac Hello.java
[nurmes]btg: java Hello
Ben
Ben
BenGolding
Ben
[nurmes]btg: 

Was that what you expected? It appears that name and name2 are different now! But didn't we say they were actually referring to the very same string? Let's check - remember that == lets us check if two variables refer to the exact same object.


class Hello {
	public static void main(String[] args) {
		String name = "Ben";
		String name2 = name;

		System.out.println(name);
		System.out.println(name2);
		System.out.println(name == name2);

		name += "Golding";

		System.out.println(name);
		System.out.println(name2);
		System.out.println(name == name2);
	}
}

 

[nurmes]btg: javac Hello.java
[nurmes]btg: java Hello
Ben
Ben
true
BenGolding
Ben
false
[nurmes]btg: 

Interesting... the two variables start off referring to the same string. In fact name2 is still referring to the same string afterwards. The reason for this is another property of strings - they are immutable (they cannot be changed). Once a string is created, you can think of it as having an airtight seal - you can see what it contains but you cannot add or remove any characters.

But hold on, haven't we changed the value of name? Not exactly. We've taken the string which was referred to by name, i.e. "Ben". We've then created a new string and put in it "Ben" plus some extra letters ("Golding"). Then we've changed name to refer to this new string. This is how adding to a string works.

So name now refers to a totally new string, which is why name == name2 is false. To "change" the value of name, we've created a new string and copied the contents of the old one across to it.
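The airtight seal applies to the built-in functions of strings as well. As a rough sketch (the class name SealDemo and the helper shout are my own inventions, and writing helpers like this is something for a later lesson), toUpperCase doesn't alter the string it is used on. It hands back a new string and leaves the original alone:

```java
class SealDemo {
	// Returns an upper-case copy; the string passed in is left untouched.
	static String shout(String s) {
		return s.toUpperCase();
	}

	public static void main(String[] args) {
		String name = "Ben";
		String loud = shout(name);

		System.out.println(name); // still Ben
		System.out.println(loud); // BEN
	}
}
```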

The advantage of this is that we can have as many variables which refer to a string as we like, and be confident that none of them will be able to change it under our noses. Strings really can't be changed. The disadvantage is that adding anything to a string (as in our program) involves copying both the original string (e.g. "Ben") and the string to be added (e.g. "Golding") into a totally new string. Doing this too much isn't very efficient, so there is something called a StringBuffer that you might meet later, which is better for fiddling around with (newer versions of Java also have a StringBuilder, which does the same job). While strings and some other objects are immutable, this is certainly not true of all objects, as we'll see later.
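If you're curious what a changeable object looks like, here is a quick sketch using StringBuffer. Treat it as a peek ahead rather than something to learn now: the class name BufferDemo and the helper changeThroughAlias are made up, and it uses the new keyword, which we haven't covered. The append function adds characters to the existing object instead of creating a fresh one, so both variables see the change.

```java
class BufferDemo {
	// Changes a StringBuffer through one variable and reads it back through another.
	static String changeThroughAlias() {
		StringBuffer first = new StringBuffer("Ben");
		StringBuffer second = first;   // copies the object reference only

		first.append("Golding");       // changes the one shared object in place

		return second.toString();      // read the result through the other variable
	}

	public static void main(String[] args) {
		System.out.println(changeThroughAlias()); // prints BenGolding
	}
}
```

Compare this with the string version above, where name2 carried on printing Ben after we added to name.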

That really is it for this lesson. Most of what you've learned about strings applies to other types of objects too. In the next lesson on objects, we'll make our own object type and have a play with it. The main point from this lesson was about object references.


Too patronising? Too complex? Typing error? Offended by traffic cones? Got a question or something I should add? Send an email to ben_golding@yahoo.co.uk !


The contents of this site are copyright of Ben Golding