3.5 Kernel-Text

Notably, this category contains the classes Character, String and Symbol. String instances are collections of Character instances. All of these are limited to the small ASCII character set. The corresponding classes that can handle the much larger Unicode character set are UnicodeCodePoint, UnicodeString and UnicodeSymbol. As stated before, the Unicode classes can also handle ASCII, and the two flavors are interchangeable. You usually don’t need to care about which flavor (ASCII or Unicode) an instance actually is.

Character/UnicodeCodePoint. An individual character is written prefixed with a “$”, for example: $A or $φ. A character can also be created from its numeric code point with the class-side method codePoint:. Note, however, that Character answers only for code points it can represent, and answers nil for the others. It is generally safer to ask UnicodeCodePoint instead:

Character codePoint: 65 ⇒ $A
Character codePoint: 966 ⇒ nil
UnicodeCodePoint codePoint: 966 ⇒ $φ

There are class-side methods for non-printable characters: Character tab, Character lf, etc.
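As a small sketch of how these class-side methods relate to code points, the expressions below can be evaluated one at a time in a workspace. They assume that the instance-side accessor codePoint mirrors the class-side codePoint: shown above; the expected results are given as comments.

```smalltalk
"Assumption: the instance-side message codePoint answers the numeric code point."
$A codePoint.                      "⇒ 65"
(Character codePoint: 65) = $A.    "⇒ true"
Character tab codePoint.           "⇒ 9, horizontal tab"
Character lf codePoint.            "⇒ 10, line feed"
```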

Additionally, UnicodeCodePoint defines #namedCharactersMap, which lets you enter many Unicode characters easily, like:

\leftright then press spacebar.

As each string is a collection of characters, when iterating over a string we can use the Character instance methods:

'There are 12 apples.' select: [:c | c isDigit].
⇒ '12'

Example 3.13: Twelve apples
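Other Character instance methods can be used in the same way. The following is a sketch with a few more tests and conversions, assuming the messages isVowel, isUppercase and asUppercase are understood by characters in your image; the expected results are given as comments.

```smalltalk
"Keep only the vowels."
'There are 12 apples.' select: [:c | c isVowel].        "⇒ 'eeaeae'"

"Keep only the uppercase letters."
'There are 12 apples.' select: [:c | c isUppercase].    "⇒ 'T'"

"Transform every character instead of filtering."
'There are 12 apples.' collect: [:c | c asUppercase].   "⇒ 'THERE ARE 12 APPLES.'"
```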

Modify Example 3.13 to reject the numeric characters.

Exercise 3.7: Select apples

String. String is a very large class; it comes with more than 200 methods. It is useful to browse its method categories to see common ways of grouping methods.

Sometimes you may not see a category related to what you’re looking for right away.

Note. If you expect a method selector to start with a specific letter, click-select the -- all -- method category, then move the cursor over the pane listing the method names. Press this character, e.g. $f. This will scroll the method pane to the first method name starting with an “f”.

Consider the case where you need to search for a substring, a string within a string: when browsing the String class, search for method categories named like finding... or accessing. There you find a family of findXXX methods. Read the comments at the beginning of these methods:

findString: subString
   "Answer the index of subString within the receiver, starting at
   start. If the receiver does not contain subString, answer 0."
   ^ self findString: subString startingAt: 1.

Or:

findString: key startingAt: start caseSensitive: caseSensitive
   "Answer the index in this String at which the substring key first
   occurs, at or beyond start.  The match can be case-sensitive or
   not.  If no match is found, zero will be returned."
   ../..

Then experiment with the potentially interesting messages in a workspace:

'I love apples' findString: 'love'
⇒ 3 "match starts at position 3"
'I love apples' findString: 'hate'
⇒  0 "not found"
'We humans, we all love apples' findString: 'we'
⇒ 12 
'We humans, we all love apples'
   findString: 'we'
   startingAt: 1
   caseSensitive: false
⇒ 1 
'we humans, we all love apples' findString: 'we'
⇒ 1 
'we humans, we all love apples' findString: 'we' startingAt: 2
⇒ 12
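Because findString: answers 0 when there is no match, the answered index is usually tested before it is used. The following is a minimal workspace sketch of that pattern; the variable names text, key and index are chosen here for illustration only.

```smalltalk
| text key index |
text := 'I love apples'.
key := 'love'.
index := text findString: key.    "3, or 0 when key is absent"
index > 0
   ifTrue: [text copyFrom: index to: index + key size - 1]
   ifFalse: ['not found'].
"⇒ 'love'"
```

Replacing 'love' with 'hate' makes index equal to 0, so the expression answers 'not found' instead.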

Following these paths will, most of the time, lead you toward the answer you are looking for.

We want to format a string of the form 'Joe bought XX apples and YY oranges' to the form 'Joe bought 5 apples and 4 oranges'. What message should be used?

Exercise 3.8: Format a string