Showing posts with label Sikuli Script. Show all posts

Learning Resources: Coursera

A few months ago I wrote a post on several learning resources for Java; this time around I thought I'd write about a single learning resource for... well, quite a lot: Coursera.  I imagine most people are already familiar with it, but for those who aren't, Coursera provides online courses in a variety of subject areas for free.  Founded by two Stanford faculty members, it counts some top-notch institutions among its partners (70+ as of this writing).

Currently, courses are offered for free, but Coursera ultimately intends to generate revenue.  To that end, they've begun implementing some pay services; for example, they recently introduced a special service called Signature Track that (for a premium) awards students who complete courses with certificates securely tied to their identities, making it easier to share them with potential employers, etc. It may be their hope that revenue from these ancillary services will enable them to continue providing the course material for free, but I suppose some sort of fee-based system isn't necessarily out of the question in the future.  All of which is to say: I'd recommend availing yourself of the service as soon as possible.

As with a "real" course, the material is broken up into units (normally weekly, in my experience), so you and your virtual classmates progress through the units together; course-specific online forums facilitate communication between students and teachers.  Lectures are delivered via online videos with short questions embedded throughout to gauge your understanding; you can listen to the lectures at your own pace (within the time frame for each unit, of course).  Quizzes at the end of each unit generally comprise part of your course grade, which may also include graded exercises, projects, and exams.

Given Coursera's format, it should come as no surprise that a large number of technology- and programming-oriented courses are offered.  While there are no courses (that I'm aware of) for learning Java, I would recommend the Learn to Program: The Fundamentals and Learn to Program: Crafting Quality Code courses offered by the University of Toronto.  As their names imply, they're intended to provide an introduction to programming concepts in general, but Python is the language of choice so they end up being pretty good introductions to Python specifically, too (Python is the primary scripting language used by Sikuli, the UI automation tool I've discussed in other posts).

If you're hesitant to jump into the pool and would like to sample a course to see if this learning method is right for you, there are currently two computer science courses available for self-study: Introduction to Databases and Computer Science 101, both from Stanford University.  Without due dates and a live online forum (and with no statement of accomplishment for completing self-study courses) it's not quite the same thing, but at the very least you can get a sense of what online lectures and quizzes are like.  The Computer Science 101 course may seem very basic to people who work with computers on a daily basis, but for anybody unfamiliar with programming/scripting, the first three units teach some fundamental concepts including variables, dot notation, and flow control, albeit via a simplified proprietary JavaScript-like language developed by the course instructor (Nick Parlante, who also developed Google's Python Class).  The "image puzzles" in the course are also kind of cool.

Sikuli Script: Troubleshooting Problems with Scripts

One of Sikuli's appealing features is its relatively straightforward concept; ironically, that can make it all the more puzzling when things aren't working quite the way they should-- you can see your button right there on the screen, so why can't Sikuli?  This post has a couple of tips for new users for troubleshooting common script problems.

Note: as of this writing, the latest version of Sikuli is version 930, available in Windows as a set of files to be unzipped over the previous version, version 905.  Version 930 includes some significant fixes; if you're experiencing issues and using version 905, I'd recommend upgrading to version 930 as a first step.

If Sikuli can't find a graphic pattern:

1) Be sure to check any FindFailed error messages carefully.  I know, I know-- that may sound obvious, but when you're knee-deep in a scripting session and you've already worked through half a dozen "FindFailed" errors in the last hour, you may see the seventh pop up in Sikuli's log and immediately assume it's the same as the others.  A case in point from my own personal experience: after troubleshooting for some time to figure out why Sikuli wasn't finding a match for my graphic pattern, a closer inspection of the FindFailed error revealed that Sikuli wasn't looking for a graphic pattern at all-- for some reason it was looking for text (treating my pattern file's name as a string).  Among other things, a FindFailed error can also indicate a problem finding a png file in its expected location on disk.

2) If you're looking for the pattern in a defined region (as opposed to a search on the entire screen), make sure the region covers the intended area and fully contains the pattern you're looking for; one way to check this is by calling the highlight() method on your region-- e.g., my_defined_region.highlight(2). You can remove the call later on once you've finished debugging.

3) Don't be stingy with wait times.  Sikuli can interact with an application much faster than a human, and that's not always a good thing.  In some cases it may not find a pattern simply because it hasn't had time to draw completely (in response to some previous action).  Use a simple wait() statement to give the screen, region, etc. more time to draw before proceeding to the next step.
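
Tips 2 and 3 can be rolled into one small helper.  What follows is only a minimal sketch, not something from my actual scripts: find_or_highlight is a made-up name, and it assumes the region you pass in behaves like a Sikuli Region, with exists(pattern, seconds) returning None when nothing is found and highlight(seconds) outlining the region onscreen.  It gives the pattern a generous amount of time to appear and, if it never does, shows you which area was actually searched.

```python
def find_or_highlight(region, pattern, timeout=10, highlight_secs=2):
    """Look for `pattern` inside `region`, waiting up to `timeout` seconds.

    `region` is assumed to behave like a Sikuli Region: exists() returns a
    match, or None once the timeout expires (no FindFailed is raised).
    """
    match = region.exists(pattern, timeout)
    if match is None:
        # Show the searched area so a misplaced or too-small region is obvious.
        region.highlight(highlight_secs)
    return match
```

In a script this might be called as find_or_highlight(my_defined_region, "OKButton.png"), where "OKButton.png" is just a placeholder file name.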

If Sikuli incorrectly identifies a match:

1) A mis-click near a pattern (close but not centered on it) may stem from re-drawing and/or re-sizing a window or components in a window. Consider a scenario where Sikuli is waiting for a button to appear onscreen; as soon as it appears it registers the match.  Subsequently, the window finishes drawing, during which the placement of the button is adjusted to make room for other components.  Sikuli doesn't recognize this change in the button's location-- as far as it's concerned, the button is still where its pattern was first detected, and that's where Sikuli will try to interact with it.  Possible solutions in this case include inserting a fixed (designated in seconds) wait() statement or using another pattern as a cue to Sikuli that the screen has finished drawing.

2) If Sikuli keeps matching a pattern to items other than the one you're actually trying to isolate, first make sure your pattern is "focused"-- you generally want to avoid any nondescript surrounding white space that doesn't help Sikuli differentiate it from similar regions.  You can also increase the similarity setting for the pattern to require a closer match.  With your script loaded in Sikuli, bring up the application screen where a match for your pattern should be found and click on the pattern in your script, launching the Pattern Settings dialog.  The Matching Preview tab shows the current screen, highlighting any regions that match your pattern-- if you adjust the Similarity slider, the preview automatically updates to reflect the new value.  Yet another solution: limiting your search to a specific region around your target, excluding other potential matches outside the region.


The Matching Preview tab of the Pattern Settings dialog
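
The "use another pattern as a cue" idea from the first tip might look something like this.  It's a rough sketch under a few assumptions: click_when_settled is a hypothetical helper name, the region is expected to behave like a Sikuli Region (wait() fails with FindFailed if the cue never shows up), and it's up to you to pick a cue image that is drawn last.

```python
def click_when_settled(region, target, cue, timeout=10):
    """Click `target` only after `cue` is visible in `region`.

    `cue` should be something that is drawn last (a status bar, a footer
    button, and so on), so its presence suggests the layout is final.
    `region` is assumed to behave like a Sikuli Region, whose wait() raises
    FindFailed if the pattern does not appear in time.
    """
    region.wait(cue, timeout)      # block until the cue has been drawn
    return region.click(target)    # the target is searched for only now
```

This combines nicely with the similarity setting from the second tip; for example, click_when_settled(SCREEN, Pattern("Save.png").similar(0.9), "StatusBar.png") would require a closer match for the button (both file names are placeholders).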

For scripting issues not necessarily related to finding pattern matches, the popup() method can be used as a poor man's break point for debugging.  For example, you can use it to pause your script and print out the value of a counter variable in a for loop, a value evaluated in an if statement, etc.  The method can be removed from your final script once you've finished troubleshooting.
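
Here's a small sketch of the "poor man's break point" idea.  The helper name debug_break is my own invention, and the display function is passed in as a parameter (inside Sikuli you would pass popup) so the formatting can be tried out anywhere.

```python
def debug_break(show, label, **values):
    """Pause the script and display the given values.

    `show` is the function used to display the message; inside Sikuli that
    would be popup, which blocks until the OK button is clicked.
    """
    # Sort the names so the message reads the same way on every run.
    parts = ["%s = %r" % (name, values[name]) for name in sorted(values)]
    message = "%s: %s" % (label, ", ".join(parts))
    show(message)
    return message
```

Inside a for loop you might call debug_break(popup, "color loop", i=i, total=len(image_list)) and delete the call once the script is behaving.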

Finally, when all else fails or you're seeing completely inexplicable issues, save your work, shut down Sikuli, and check for any orphaned processes (in Windows, these appear as javaw.exe instances) before restarting.  I've seen scenarios where a script didn't seem to shut down properly (particularly when execution was interrupted with Alt + Shift + C) and behavior became flaky and unpredictable when I attempted to re-run the script or run another script.  Clearing out these orphaned processes and re-starting Sikuli sometimes helps in these cases.
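
If you'd rather not hunt through Task Manager, the check for leftover javaw.exe instances can be scripted.  This is only a sketch, meant to be run from a regular Python 3 interpreter on Windows (not from inside Sikuli): the function names are hypothetical, and it relies on the Windows tasklist command with its /FI, /FO CSV and /NH options.  Keep in mind that other Java applications also show up as javaw.exe, so a PID in the list isn't necessarily a Sikuli orphan.

```python
import csv
import subprocess


def parse_tasklist_csv(output, image_name="javaw.exe"):
    """Return the PIDs found in `tasklist /FO CSV /NH` output for image_name."""
    pids = []
    for row in csv.reader(output.splitlines()):
        # Data rows look like: "javaw.exe","4321","Console","1","98,765 K"
        # Informational lines (e.g. "INFO: No tasks ...") have one field only.
        if len(row) >= 2 and row[0].lower() == image_name.lower():
            pids.append(int(row[1]))
    return pids


def find_orphans(image_name="javaw.exe"):
    """Ask Windows for running processes with the given image name."""
    output = subprocess.check_output(
        ["tasklist", "/FI", "IMAGENAME eq %s" % image_name, "/FO", "CSV", "/NH"]
    )
    return parse_tasklist_csv(output.decode("ascii", "replace"), image_name)
```

With Sikuli closed, an empty list from find_orphans() suggests there's nothing left over to clean up.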

Sikuli IDE: Using Pattern Objects

If you go back to the script illustrated in the last post and look at the code for click() or wait() calls in IDLE or some other IDE for Python (other than Sikuli's own editor), you'll see something like the following:
...
wait("FileEditForm.png",5)
...
click("1351739445615.png")
Beneath all the source images in our script (as displayed in Sikuli's IDE) are strings matching the image file names.  When rendering the script in its IDE, Sikuli interprets these strings as instances of its Pattern object, replacing them with the images they represent.  It's important to recognize that we are not limited to using Pattern objects within click() and wait() calls directly-- we can manipulate them programmatically to make scripts more flexible and/or efficient.

To illustrate, let's say we have an app that looks like this:


When the user clicks on any of the color buttons along the bottom of the app form, the large "Color" text changes to match that color.  A straightforward way of writing a test script would be to write four pairs of click() and exists() methods, with each click() on one of the four buttons followed by an exists() check on the "Color" text (to verify it's displaying the right color).  But keeping Pattern objects in mind we could also organize our script like this:


A Sikuli script with Pattern objects organized in a list

In this script, the Pattern objects (images) are organized into a list of tuple pairs; the first image in each pair is the button being clicked, and the second is the "Color" text as we expect it to appear when each button is clicked.  We can use the image list with a for loop to check all four buttons. Using a list like this opens up other possibilities, too-- for example, we could easily step through it randomly instead of following a set order. Obviously, this is an oversimplified example, but it still reflects a fundamental idea behind UI testing that can be applied to more complex applications: interacting with one UI element (via a click, for example), then verifying expected behavior by confirming the existence (or non-existence) of some associated visual "cue".
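
Since the script above is shown as a screenshot, here is a plain-text sketch of the same idea.  It isn't a copy of the pictured script: check_color_buttons is a hypothetical name, and Sikuli's click() and exists() are passed in as parameters so the loop logic can be exercised outside the IDE.

```python
import random


def check_color_buttons(image_pairs, click, exists, timeout=3, shuffle=False):
    """Click each button and confirm the expected "Color" text appears.

    image_pairs: list of (button_image, expected_text_image) tuples.
    click, exists: the Sikuli functions of the same names (or stand-ins).
    Returns the list of button images whose expected result was not seen.
    """
    pairs = list(image_pairs)
    if shuffle:
        random.shuffle(pairs)          # step through the buttons in random order
    failures = []
    for button_image, expected_image in pairs:
        click(button_image)
        if not exists(expected_image, timeout):
            failures.append(button_image)
    return failures
```

In a Sikuli script you'd call something like check_color_buttons([("RedButton.png", "ColorRed.png"), ("BlueButton.png", "ColorBlue.png")], click, exists), with your own captures in place of these made-up file names; an empty list back means every button produced the expected text.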

A few words about some "gotchas" when working with source images: going back to the example in the first paragraph about file name strings, there's no path information necessary with this particular example because the files are located in the same directory as the Sikuli script.  Of course, if you wanted to put your source images in a different directory, you'd have to add corresponding path information to the file name string.  Be aware that as of build 905 of Sikuli X, moving the images out of the main .sikuli folder for your script can also result in problems with rendering them properly in Sikuli's IDE-- you'll see some errors appear in the Message pane at the bottom of the IDE when loading your script, and where you'd expect to see images you may see the underlying strings instead. Here's what my image list looks like in Sikuli's IDE after moving the images to a sub-directory in the .sikuli folder called "Imgs" (and modifying the script accordingly):

image_list = [("\\Imgs\\EE.png","\\Imgs\\Color.png"),
        ("\\Imgs\\1354395248414.png","\\Imgs\\C0l0r.png"),
        ("\\Imgs\\1354395299725.png","\\Imgs\\C0l0r-1.png"),
        ("\\Imgs\\1354395332765.png","\\Imgs\\C0l0r-2.png")]

The script still runs successfully, but the images themselves are not loaded and displayed in the IDE.

One alternative way to deal with source images located in directories other than the .sikuli folder is the addImagePath() method.  Sikuli maintains an image search path-- a list of directories that it will search when a given image can't be found in the active .sikuli folder.  So for example, if you moved an image into a directory called "C:\Temp\Images", you could just add the directory to the image search path using the following code:

imgPath = list(getImagePath())
if 'C:\\Temp\\Images' not in imgPath:
    addImagePath('C:\\Temp\\Images')

Subsequently, the full path for any images in that directory would not have to be specified-- you could just specify the image file name by itself; Sikuli would automatically include the "C:\Temp\Images" directory when looking for the image.  Note that while this method does save some time typing out paths when specifying images, it may still result in the same problems explained above when using non-.sikuli folders for images.
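
If you end up with more than one extra image directory, the check-then-add step can be wrapped up in a function.  This is just a sketch: ensure_image_paths is a name I made up, and getImagePath() and addImagePath() are passed in as parameters rather than called directly, so the helper can be tried outside Sikuli.

```python
def ensure_image_paths(directories, get_paths, add_path):
    """Add each directory to the image search path unless already present.

    get_paths and add_path stand in for Sikuli's getImagePath() and
    addImagePath(); they are passed in so the helper can be tried anywhere.
    Returns the directories that were actually added.
    """
    current = list(get_paths())
    added = []
    for directory in directories:
        if directory not in current:
            add_path(directory)
            current.append(directory)   # avoid adding a repeated entry twice
            added.append(directory)
    return added
```

Inside a script the call would look like ensure_image_paths(['C:\\Temp\\Images'], getImagePath, addImagePath).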

Sikuli IDE: A Basic Walkthrough

This is a brief Windows-oriented tutorial that walks through a Sikuli IDE test script (there are tutorials provided on the Project Sikuli site, too).  This script demonstrates how to create and manipulate the Sikuli App object and perform some basic UI actions using the Notepad application.

Here's the starting point for your script:

NPPath = 'C:\\Windows\\notepad.exe'
#Open Notepad using the App object
NPApp = App.open(NPPath)
#Wait until the menu bar is visible, then display message
#that the window is open.
wait("Your Notepad Menu Bar",5)
popup("Window is open!")
menuBar = getLastMatch()  # Takes advantage of wait function side effect
#Highlight the menu bar region for 3 seconds then click on text
menuBar.highlight(3)
menuBar.click('Help')
#Define a new region using 'Help' text as starting reference
dropDownRegion = Region(menuBar.getLastMatch().getX(),menuBar.getLastMatch().getY(),300,100)
dropDownRegion.click('About')
wait(3)
click("Your OK Button")
type("This is a test message")
wait(3)
NPApp.close()

Open Sikuli IDE and paste this into the IDE window.  You'll be unable to run it as is, though-- you need to supply at least two graphic elements: the Notepad menu bar and the Notepad About dialog OK button.  Highlight the text "Your Notepad Menu Bar" (including the quotes) in line 6 and, with Sikuli IDE still open, launch Notepad.  Press CTRL + SHIFT + 2-- you should see your screen "fade" slightly as Sikuli IDE enters its capture mode.  Click and drag the cross-hair across the menu bar items-- it should look something like this:


Capturing the menu bar

When you release the mouse button, the screenshot of the menu bar should automatically get inserted into the script:


Script with menu bar screenshot pasted

In Notepad, click Help in the menu bar, then "About Notepad" and repeat the steps above, this time replacing the "Your OK Button" text near the bottom of the script with a screenshot of the OK button in the About Notepad dialog.


OK button pasted in

Now try running the script (note that when the "Window is open!" message is displayed, the script is paused until you click the OK button).  Also make sure the path to Notepad is valid for your particular version of Windows (the script was written on a PC running Windows 7).

Now let's step through the lines of the script; if you're experiencing problems I'll try to highlight some "gotchas" as we go that might be the source.

In the first few lines of the script we're opening Notepad by creating a Sikuli App object-- the App.open() function does this using the path to your target application.

Sikuli's wait function (line 6) scans the screen for the existence of the indicated (text or graphic) element; the second parameter indicates how long Sikuli should wait (in seconds) for the element to appear before failing (you can use the constant FOREVER to specify that Sikuli should wait indefinitely)-- here we're giving Notepad five seconds to launch.  If you have a particularly slow machine where it seems to be taking a while for Notepad to launch, try upping this wait time parameter.  There's a corresponding function in Sikuli called waitVanish()-- this function behaves similarly to the wait function, but scans the screen for the absence of a given element (i.e., waits for it to vanish from the screen).  Other similar functions: find(), which (no surprise here) simply finds the given element onscreen without acting on it; click(), which we'll see below and which finds and clicks on the specified element; and exists(), which checks for the presence of an element onscreen.
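
As a quick illustration of waitVanish(), here's a sketch of how you might confirm that a dialog really closed after clicking its OK button.  The helper name dismiss_dialog and its parameters are hypothetical, and it assumes a region object that behaves like a Sikuli Region, where waitVanish() returns True if the pattern disappears within the timeout and False otherwise.

```python
def dismiss_dialog(region, ok_button, dialog_marker, timeout=5):
    """Click OK, then confirm the dialog actually went away.

    region is assumed to behave like a Sikuli Region: click() finds and clicks
    a pattern, and waitVanish() returns True if the pattern disappears within
    `timeout` seconds and False otherwise.
    """
    region.click(ok_button)
    return region.waitVanish(dialog_marker, timeout)
```

Here dialog_marker would be a capture of something that only appears in the dialog (its title, say); a False result tells you the dialog is still sitting there.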

The popup() function (line 7) simply displays a message box with the indicated text-- as I mentioned previously, Sikuli will pause after the popup is displayed and wait for the user to click the OK button before continuing.  This can be useful for troubleshooting scripts or pausing the script to allow for a manual task.

In line 8, the menuBar region is retrieved using the getLastMatch() function-- this illustrates a useful side effect of many of Sikuli's basic functions, including the wait() function-- when the function is successful (e.g., the element that wait() is waiting for is detected) the location of the element is stored as a match that can be retrieved later via the getLastMatch() function.  So in the case of our script, once the wait() function in line 6 is successful, it's not necessary to re-find or otherwise identify the menu bar pattern (which can be a costly operation in terms of processing resources).
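
To make the side effect a little more concrete, here's a tiny sketch (wait_and_remember is a made-up name, and the region is assumed to behave like a Sikuli Region).  The screen is searched once by wait(), and the stored result is picked up afterwards without a second search.

```python
def wait_and_remember(region, pattern, timeout=5):
    """Wait for `pattern`, then return the stored match without searching again."""
    region.wait(pattern, timeout)    # the only search; raises FindFailed on timeout
    return region.getLastMatch()     # retrieves the match saved by wait()
```

Note that wait() also hands back the match directly, so menuBar = wait(...) would work just as well in the script; getLastMatch() is handy when the match is needed some lines later.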

In line 10, the highlight() function is called for the menuBar region we just defined, which simply highlights the region for the given amount of time (in seconds).  This can be useful for troubleshooting scripts in instances where you're dynamically defining a region that's supposed to contain a specific element-- you can use the highlight to see what's actually included in the region.  A word of warning about the highlight() function, though:  I've seen cases where it may result in a redraw that "hides" some child windows (like drop-downs, for example).

In the next line, text recognition is used to identify the Help button in the menu bar and the identified area is clicked.  As I mentioned in my previous post, Sikuli IDE's text recognition can be a little finicky sometimes.  If you seem to be having problems with this particular action in the script, try using a graphic element (an actual screenshot of the Help button, captured as described above) instead of text.  Text recognition seems to work best for cases like this: a short string with a simple well-contrasted setting (e.g., black text on white).

One more important thing to point out in line 11 is the use of click() as a method of the menuBar region.  When functions are called as methods of regions, Sikuli only scans those regions, not the entire screen.  In fact, in line 6 of the script when we called the wait() function without specifying a region, it's called as a method of the default SCREEN region-- a region encompassing the entire screen.  Generally it's good practice to call these functions on regions instead of the entire screen since it's less costly to scan a subsection of the screen as opposed to the entire screen.

Line 13 defines a new region dynamically.  A Sikuli Region object's area is defined by four parameters-- x and y coordinates that define its upper left corner, and width and height parameters.  In this line in the script we're using the getLastMatch() function again, this time for the menuBar region.  The click() function in line 11, also called for the menuBar region, recorded the location of the "Help" text as a successful match; getLastMatch() returns it.  Using the match as our starting point, we can define a new region by getting the X and Y parameters of the "Help" text's region-- via getX() and getY()-- then making our new region 300 pixels wide by 100 pixels tall.  This line could cause problems depending on your computer's resolution-- if you find that the next line is failing (even after ruling out a text recognition failure) try making the region larger in this step-- it could be that the 300 by 100 pixel region is too small to encompass the About Notepad button.
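
The arithmetic behind the new region is simple enough to pull out into a helper, which also makes it easy to experiment with larger sizes.  This is a sketch with assumptions: dropdown_bounds is a hypothetical name, the default screen size of 1920 by 1080 is only a placeholder, and the clipping to the screen edges is my own addition rather than something the script above does.

```python
def dropdown_bounds(x, y, width=300, height=100, screen_w=1920, screen_h=1080):
    """Return (x, y, w, h) for a region anchored at (x, y), clipped to the screen.

    (x, y) is the upper left corner, typically taken from a match via
    getX() and getY(); width and height are reduced if the region would
    otherwise run past the right or bottom edge of the screen.
    """
    w = max(0, min(width, screen_w - x))
    h = max(0, min(height, screen_h - y))
    return (x, y, w, h)
```

In the script, the result can be unpacked straight into the Region constructor, e.g. Region(*dropdown_bounds(m.getX(), m.getY(), 400, 200, SCREEN.getW(), SCREEN.getH())), where m is the match returned by menuBar.getLastMatch().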

The last few lines are self-explanatory or repeats of functions we've already seen: Sikuli finds and clicks on the text "About" in the newly defined dropDownRegion, then waits for 3 seconds as the About dialog is displayed.  After three seconds, the OK button is identified (via a screenshot this time, not text) and clicked. The text "This is a test message" is typed into Notepad's main window and finally, after a 3 second wait the app is closed using the App object created in line 3.

Of course, this is a very simple script and only illustrates the tip of the iceberg of Sikuli's power.  If you're interested in a more detailed look, I recommend checking out Sikuli's documentation (which I think is pretty well-written) here.

Sikuli IDE: An Overview


Switching gears quite a bit from my posts so far, in this post I'm going to give a quick overview of Sikuli IDE, a free UI automation tool.

On the face of it, UI automation testing seems like a pretty straightforward concept: mimic the actions that a user would perform while interacting with an application and confirm the application responds to those actions appropriately.  When you roll up your sleeves and start trying to do it, though, you quickly realize that it's often much easier said than done and there are lots of pitfalls and hurdles to overcome.  If you try to access UI controls programmatically (as objects), you may come up against custom controls that your tool of choice can't recognize, controls that aren't uniquely named, or seemingly infinite nested levels that make identifying and isolating individual controls difficult.  Tools that use UI manipulation (e.g., move cursor to coordinates x,y and click, tab to button 1, press the down arrow twice, etc.) can result in scripts that are obscure and/or difficult to maintain.

Sikuli IDE (and similar tools such as Citratest and DeviceAnywhere) uses a technique that tries to smooth over some of the issues encountered with these other strategies.  Basically, you capture GUI elements (buttons, links, etc.) in small graphic files; during script execution, Sikuli uses these captures to identify those elements onscreen and interact with them.  The strategy maintains some of the modularity of an object-oriented approach without any dependencies on the application's underlying architecture.

The documentation provided on the Project Sikuli site is decent, organized in a logical, straightforward manner by key objects.  As its name implies, Sikuli IDE also comes with its own IDE, where graphical "objects" are represented as images, making dealing with scripts very intuitive (Python/Jython is the language used for scripting in Sikuli).


Sikuli IDE

Of course, as you might expect from a tool that keys off graphics, Sikuli is less than ideal in situations where the UI is still unstable-- although the easy-to-use capture interface (basically, pressing CTRL + SHIFT + 2, then selecting the region you want to capture) makes it relatively easy to adjust to UI changes.  While Sikuli does claim to have some text recognition functionality, this seems to be in the very early stages and very unreliable at this point.


Sikuli's text recognition (sort of)