Tuesday, September 29, 2009

Extract sequences (unapplySeq)

This topic continues the previous topic on matching and Extractors. Make sure you have looked at Extractors 1 first.

The first extractor topic covered the unapply method and how it is used during matching. Today I want to visit a similar method, unapplySeq, which is used to match sequences. The method def unapplySeq(param): Option[Seq[T]] can be used instead of unapply.

Note: if both unapply and unapplySeq are defined, only unapply is used.
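
To make the note concrete, here is a minimal sketch. The extractor Both and the wrapper PrecedenceDemo are hypothetical names invented for this illustration; they are not part of the earlier topic. As far as I know the behaviour is the same in current Scala 2 and Scala 3 compilers.

```scala
// Hypothetical extractor (for illustration only) that defines both methods.
object Both {
  def unapply(s: String): Option[String] =
    Some("unapply saw: " + s)

  def unapplySeq(s: String): Option[Seq[String]] =
    Some(s.split("\\s+").toList)
}

object PrecedenceDemo {
  // The compiler resolves Both(x) through unapply, so x is one String,
  // not the first element of the sequence returned by unapplySeq.
  val picked: String = "one two" match {
    case Both(x) => x
  }
}
```

PrecedenceDemo.picked holds the string built by unapply, which shows that unapplySeq was never consulted.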

When matching on sequences, the _* symbol matches an arbitrary sequence (all remaining elements, possibly none). We use this several times in the examples below.

    scala> object FindAs {
         |   def unapplySeq(string: String): Option[List[String]] = {
         |     def containsA(word: String) = word.toLowerCase contains "a"
         |
         |     if (string.toLowerCase contains "a") {
         |       val words = string.split("\\s+").
         |                          filter(containsA _)
         |       Some(words.toList)
         |     } else {
         |       None
         |     }
         |   }
         | }
    defined module FindAs

    // as usual you can use extractors to assign variables
    scala> val FindAs(a, b) = "This sentence contains 2 a-s"
    a: String = contains
    b: String = a-s

    // If you only care about the first variable you can use _* to
    // reference the rest of the sequence that you don't care about
    scala> val FindAs(a, _*) = "A crazy a sentence ack!"
    a: String = A

    // using b @ _* we can get the rest of the sequence assigned to b
    scala> val FindAs(a, b @ _*) = "A crazy a sentence ack!"
    a: String = A
    b: Seq[String] = List(crazy, a, ack!)

    // standard matching pattern
    scala> "This sentence contains 2 a-s" match {
         |   case FindAs(a, b) => println(a, b)
         |   case _ => println("whoops")
         | }
    (contains,a-s)

    // In this example we only care that it can match, not the values,
    // so we ignore all of the actual sequence by using _* as the parameters
    scala> "This sentence contains 2 a-s" match {
         |   case FindAs(_*) => println("a match")
         | }
    a match

    scala> "This sentence contains 2 a-s" match {
         |   case FindAs(first, _*) => println("first word = " + first)
         | }
    first word = contains

    scala> "A crazy a sentence ack!" match {
         |   case FindAs(first, next, rest @ _*) => println("1=%s, 2=%s, rest=%s".format(first, next, rest))
         | }
    1=A, 2=crazy, rest=List(a, ack!)
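
The same _* patterns also work on the standard collections, because companions such as List and Seq provide their own unapplySeq. The following is a small sketch of that idea; SeqPatterns and describe are names made up for this example.

```scala
object SeqPatterns {
  // List(...) in pattern position goes through the companion's unapplySeq,
  // so fixed-length and "first plus rest" patterns can be mixed freely.
  def describe(xs: List[String]): String = xs match {
    case List()                 => "empty"
    case List(only)             => "one element: " + only
    case List(first, rest @ _*) => "first=" + first + ", rest=" + rest.mkString(",")
  }
}
```

For example, SeqPatterns.describe(List("a", "b", "c")) falls through to the last case, binding first to "a" and rest to the remaining two elements.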

2 comments:

  1. This is definitely a nice article, but do you have any idea how matching is built for sequences like the following?

    List("a","b","c") match {
    case a :: b :: c :: Nil => a+b+c
    case _ => "Bah!"
    }

    I could probably look at the source for the list class, etc, but I thought a well thought out answer would be better :D

  2. Hi Dan, sorry for the delay. I have been on vacation for the last 3.5 weeks.

    This is an excellent point. There is an additional rule that applies when creating extractors. If the extractor returns Option[Tuple2[_,_]] then you can use this form.

    I will leave it at that for now because I am creating an entry on this topic that I will post tomorrow.
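
    As a rough preview of that rule, here is a sketch of an extractor whose unapply returns an Option of a pair, which lets it be written between its two sub-patterns. The names At and InfixDemo are hypothetical and exist only for this illustration. In the standard library, :: is a case class with two fields, so a :: rest in a pattern is the same as ::(a, rest).

```scala
// Hypothetical extractor: splits a string around the first '@'.
object At {
  def unapply(s: String): Option[(String, String)] = {
    val i = s.indexOf('@')
    if (i < 0) None else Some((s.substring(0, i), s.substring(i + 1)))
  }
}

object InfixDemo {
  def describe(address: String): String = address match {
    // "user At domain" is the infix spelling of At(user, domain)
    case user At domain => "user=" + user + ", domain=" + domain
    case _              => "no @ found"
  }
}
```

    Calling InfixDemo.describe on a string without an '@' falls through to the default case, since unapply returns None.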
