Scala Scripting, XML Literals, What the encodings?!

I read recently about how Scala can be used as a scripting language and decided to use it to help streamline my turn-sketch-into-applet process. The first two sketches I put up showed me that there were four basic steps to making a sketch presentable. I fired up MS Word and decided that my script should, after taking a file name as input:

1. Create a sketch folder
2. (a) Generate a ProGuard configuration file suited to the sketch and then (b) run ProGuard using that configuration file, outputting a jar in the sketch folder (the configuration file should also be saved in the sketch folder)
3. Copy the source code of the given sketch into the sketch folder
4. Generate an “index.html” file pointing to the applet and referencing the source code
4a. Possibly ask for “mouse and keyboard controls”, and an optional “extra comments”

Creating a folder is very easy with java.io.File, so that one was a no-brainer. #1 Check.

I hadn’t really looked into Scala’s IO. I decided to go with Apache Commons IO to do most of my file copying/writing, since their FileUtils class has basically exactly what I needed. Writing the configuration file was as simple as FileUtils.write(configFile, configString), and copying the source code similarly also only took one line. #2a and #3 check.
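For anyone who would rather not pull in Commons IO, here is a rough sketch of steps #1, #2a and #3 using only java.nio.file.Files from the standard library (Java 7 or later). This is a stand-in for the FileUtils calls, not what my script actually does, and the names SketchFolder, prepare and "proguard.conf" are made up for the example.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path, StandardCopyOption}

object SketchFolder {
  // Creates the sketch folder, writes the config file into it and copies
  // the sketch source next to it. Returns the path of the config file.
  // The file name "proguard.conf" is a hypothetical choice.
  def prepare(sketchDir: Path, configString: String, sourceFile: Path): Path = {
    Files.createDirectories(sketchDir) // step 1: the sketch folder

    val configFile = sketchDir.resolve("proguard.conf")
    Files.write(configFile, configString.getBytes(StandardCharsets.UTF_8)) // step 2a

    Files.copy(sourceFile, sketchDir.resolve(sourceFile.getFileName),
      StandardCopyOption.REPLACE_EXISTING) // step 3: copy the source code
    configFile
  }
}
```

Writing the bytes with an explicit UTF-8 charset avoids depending on whatever the platform default encoding happens to be.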

I haven’t yet touched Scala’s process capabilities either. Scala 2.9.0 has a new process library (scala.sys.process) taken from sbt, but I’m still running 2.8.0 so I just went with java.lang.Process. My original attempts revolved around Runtime.getRuntime().exec(""), but exec seemed unable to send arguments to the command (e.g. exec("java -version") won’t give you anything in process.getInputStream). ProcessBuilder seemed to have better infrastructure so I looked at that instead; of particular interest was the redirectErrorStream(boolean) method, which helped me out a lot. It turns out that the reason I wasn’t getting any messages back from the processes was that the programs I was testing wrote to the error stream instead of the output stream. I probably should have considered that, but hindsight is 20/20. #2b check.
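Here is a minimal sketch of the ProcessBuilder approach, assuming all you want is the exit code and whatever the program printed. The names RunCommand and run are made up for the example; only ProcessBuilder, redirectErrorStream and the stream classes come from the JDK.

```scala
import java.io.{BufferedReader, InputStreamReader}

object RunCommand {
  // Runs a command and returns (exit code, combined stdout + stderr).
  def run(command: String*): (Int, String) = {
    val builder = new ProcessBuilder(command: _*)
    builder.redirectErrorStream(true) // merge stderr into stdout so nothing goes missing
    val process = builder.start()

    val reader = new BufferedReader(new InputStreamReader(process.getInputStream))
    val output = new StringBuilder
    try {
      var line = reader.readLine()
      while (line != null) { // readLine returns null at end of stream
        output.append(line).append('\n')
        line = reader.readLine()
      }
    } finally reader.close()

    (process.waitFor(), output.toString)
  }
}
```

Calling RunCommand.run("java", "-version") returns the version text that "java -version" writes to the error stream. Without the redirectErrorStream(true) line the returned string would be empty.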

Generating index.html threw me for a loop; since Scala has XML literals, I copied the HTML directly into the script. There is JavaScript code within the HTML, and JavaScript happens to use curly braces to denote object literals. Scala interpreted those curly braces as the start of escape blocks. I ended up re-writing the object literal curly braces within escape blocks, like so: var parameters = {"{ };"}. Again, hindsight steps in and says something like “You should’ve just written a raw string literal instead of using XML; then you wouldn’t be having this problem!”. But hindsight really enjoys telling you how much better HE could have done it. You can learn a lot from him but he sure acts like an ass sometimes :)
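For the record, this is roughly what hindsight had in mind: a triple-quoted string with format placeholders. It is only a sketch, and the HTML, the object name IndexTemplate and the parameter names are invented for illustration; this is not my real index.html.

```scala
object IndexTemplate {
  // Inside a triple-quoted string, { and } are ordinary characters,
  // so the JavaScript object literals need no escaping at all.
  private val template =
    """<html>
      |  <body>
      |    <script type="text/javascript">
      |      var parameters = { };
      |      var attributes = { code: "%s", width: %s, height: %s };
      |    </script>
      |  </body>
      |</html>""".stripMargin

  // Fills the three %s placeholders in order.
  def render(mainClass: String, width: String, height: String): String =
    template.format(mainClass, width, height)
}
```

The trade-off is that format has its own special character: a literal % in the template would have to be written as %%. You also lose the well-formedness check that XML literals give you for free.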

There’s a gotcha when using Scala expressions as attribute values. Curly braces will NOT be interpreted as a Scala escape block if they’re within quotes:

scala> val name = "Xiaohan Zhang"
name: java.lang.String = Xiaohan Zhang

scala> <object name="{name}"/>
res0: scala.xml.Elem = <object name="{name}"></object>
//Tautological attribute is a tautology. AND it's an attribute!

The way to do it is to just let Scala put in the quotation marks for you:

scala> <object name={name}/>
res1: scala.xml.Elem = <object name="Xiaohan Zhang"></object>

#4 check. Now, I didn’t expect the script to run the first time I fired it up; I was dealing with unfamiliar terrain and regardless, programming is very difficult to do right the first time. But the error I got completely caught me off guard:

IO error while decoding C:\Users\hellochar\Documents\dev\NetBeansIntelliJ\Daily\Script.scala with UTF-8
Please try specifying another one using the -encoding option

… What? Like most people, I haven’t really taken the time to learn about my encodings. I figured it was probably just some weird character in my index.html file that I copied over. Removing that code didn’t work. (Originally I tried just commenting it out but I realized that the problem was not in the logic of the code; it was in the compiler’s attempt to even READ my code).

But then I took out ALL of my code and tried running the program, and it still didn’t work.

I decided that my best route of action was to actually figure out WHAT the error message meant (my hindsight gave me a slightly condescending clap for this). Fortunately, I ran into this article very early on in my search. This is a blessing, because I really enjoy Joel’s writing. He reminds me of my high school physics teacher and FIRST robotics mentor; there’s a no-bullshit air about him and he mixes his ideologies in with his teachings, which makes for both an interesting read and something to think about afterwards. The article was a prime example. After reading it I figured the problem probably lay in some weird character I was using within my code. I kind of just stared at this screen for a while:
Where are you hiding?
AND THERE IT WAS.
Getting closer...

Closer...

CLOSER!


FUCK

FUCK. YOU. MS WORD. The problem was that I had copied and pasted the description of what my script should do from Word, which put in those “smart” quotes. The particularly annoying thing is that UTF-8 supports those types of quotes (because it supports every character), so the characters themselves weren’t the issue. My best guess is that the pasted text got saved in a different encoding, most likely a Windows code page, in which the quotes become bytes that aren’t valid UTF-8. On the upside, it’s an easy fix; just copy the text into Notepad, “save as” in UTF-8 encoding, and then copy/paste the text into the script. You won’t notice any difference, except that the code will actually run. Originally I thought that the interpreter just ignored all characters inside a comment block, but the file has to be decoded into characters before the interpreter can even tell where a comment block starts or ends. The Scala spec also only allows a specific subset of Unicode in tokens (the smart quotes are not in that set), so they don’t belong in actual code either. You just have to get rid of the character.
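To convince myself of the bytes-versus-characters distinction, here is a small sketch that runs the same pair of curly quotes through a strict UTF-8 decoder twice. It assumes the Word text ended up as Windows-1252, which I can’t prove after the fact; 0x93 and 0x94 are the Windows-1252 codes for the two quotes. The object name Utf8Check is made up.

```scala
import java.nio.ByteBuffer
import java.nio.charset.{CharacterCodingException, CodingErrorAction, StandardCharsets}

object Utf8Check {
  // Returns true if the bytes are valid UTF-8, false if a strict decoder rejects them.
  def isValidUtf8(bytes: Array[Byte]): Boolean = {
    val decoder = StandardCharsets.UTF_8.newDecoder()
      .onMalformedInput(CodingErrorAction.REPORT)      // fail instead of substituting a replacement character
      .onUnmappableCharacter(CodingErrorAction.REPORT)
    try { decoder.decode(ByteBuffer.wrap(bytes)); true }
    catch { case _: CharacterCodingException => false }
  }

  // The curly double quotes as Windows-1252 bytes (assumed source encoding).
  val cp1252Quotes: Array[Byte] = Array(0x93.toByte, 0x94.toByte)

  // The same two characters (U+201C, U+201D) encoded as UTF-8: three bytes each.
  val utf8Quotes: Array[Byte] = "\u201C\u201D".getBytes(StandardCharsets.UTF_8)
}
```

Utf8Check.isValidUtf8(Utf8Check.cp1252Quotes) is false because 0x93 is a continuation byte with nothing in front of it, while Utf8Check.isValidUtf8(Utf8Check.utf8Quotes) is true. Same characters, different bytes.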

After I had the basics down, I tried to go a little further by attempting to extract the size of the applet from within the source code. For example, I wanted to extract “800” and “600” from “size(800, 600);”. A regex combined with the scala.io.Source class made this pretty straightforward:

val Size = """size\((\d+), (\d+)\);?""".r // matches lines such as: size(400, 400);
val (width: String, height: String) =
  try {
    scala.io.Source.fromFile(sourceFile).getLines()
      .map(_.trim)                      // strip surrounding whitespace
      .filterNot(_.startsWith("//"))    // skip single-line comments
      .find(_.startsWith("size(")) match {
        case Some(Size(w, h)) => (w, h) // None or a non-matching line throws a MatchError
      }
  } catch {
    case _: Exception => ("900", "500") // anything that goes wrong falls back to the default
  }

I grab each line from the source file, trim the surrounding whitespace, get rid of single-line comments (it won’t catch /**/ but that’s much too complicated for me to care about right now), and find the first line that starts with “size(”. If one exists, I try matching it against my size regex to extract the digits for width and height, respectively. If anything fails, just default to 900 by 500. I can then feed those numbers into the width and height parameters of index.html.
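If I ever revisit this, a variant that avoids exceptions could look something like the sketch below. It is not what the script currently does: it takes the lines as an Iterator so it can be tried without a file, uses a looser regex that tolerates odd spacing and trailing comments, and picks the first line that actually matches rather than the first line starting with “size(”. The names SketchSize and extract are made up.

```scala
object SketchSize {
  // Allows optional spaces around the numbers and anything after the closing parenthesis.
  private val Size = """size\(\s*(\d+)\s*,\s*(\d+)\s*\).*""".r

  // Returns the width and height of the first size(w, h) call, or the fallback if none is found.
  def extract(lines: Iterator[String],
              fallback: (String, String) = ("900", "500")): (String, String) =
    lines.map(_.trim)
      .filterNot(_.startsWith("//"))              // skip single-line comments
      .collectFirst { case Size(w, h) => (w, h) } // first line the regex fully matches
      .getOrElse(fallback)
}
```

With a real file it would be called as SketchSize.extract(scala.io.Source.fromFile(sourceFile).getLines()). Note that collectFirst on an Iterator needs Scala 2.9 or newer, so this would not run on my 2.8.0 install.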

All in all, I’m pleased with the scripting aspect of Scala; the only downside is that the script takes a while to start up each time you want to run it.