2008-03-31

Beautiful Error Bars in R

One of the reasons why I haven't made the switch from SPSS to R is R's lack of proper error bar graphs. I use them frequently because they are easy to interpret: if you plot the means of several groups of participants in one error bar chart and scale the error bars to a length of one standard error, non-overlapping error bars indicate a significant difference between the corresponding means. In fact, the APA has advocated the use of error bars for reporting results since 2005 [1]. This way of reporting differences in means is also called "Inference by Eye" [1].

After my rants about SPSS, my wise R mentor, Stephan Kolassa, pointed me to the gplots library, which features a good function for drawing error bars in R: plotCI(). Stephan also pointed me to Rseek.org, an excellent search engine for R-related queries. I fiddled with Stephan's example code in order to reproduce my SPSS clustered error bar chart from last week's post on stereotype threat in complex problem solving:


And this is how it looks in R:
I like it very much; the only thing I still need to work out is how to offset the bars within the same condition so that overlapping error bars are not drawn on top of each other but next to each other, with a few pixels between them (see the sketch after the code below).

If you would like to try this out for yourself, here is the R code that produces the image above:

# Clustered Error Bar for Groups of Cases.
# Example: Experimental Condition (Stereotype Threat Yes/No) x Gender (Male / Female)
# The following values would normally be calculated from the raw data; they are
# hard-coded here so that the code can be reproduced (see the commented sketch below).

means.females <- c(0.08306698, -0.83376319)
stderr.females <- c(0.13655378, 0.06973371)

names(means.females) <- c("No","Yes")
names(stderr.females) <- c("No","Yes")

means.males <- c(0.4942997, 0.2845608)
stderr.males <- c(0.07493673, 0.18479661)

names(means.males) <- c("No","Yes")
names(stderr.males) <- c("No","Yes")
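
# For reference, these values could be computed from raw data along these lines
# (sketch only; assumes a hypothetical data frame 'd' with columns
# performance, gender, and threat):
# means.females  <- tapply(d$performance[d$gender == "female"],
#                          d$threat[d$gender == "female"], mean)
# stderr.females <- tapply(d$performance[d$gender == "female"],
#                          d$threat[d$gender == "female"],
#                          function(x) sd(x) / sqrt(length(x)))
# (and analogously for the male participants)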

# Error Bar Plot

library(gplots)

# Draw the error bar for female experiment participants:
plotCI(x = means.females, uiw = stderr.females, lty = 2, xaxt = "n",
       xlim = c(0.5, 2.5), ylim = c(-1, 1), gap = 0,
       ylab = "Microworld Performance (Z Score)", xlab = "Stereotype Threat",
       main = "Microworld performance over experimental conditions")

# Add the males to the existing plot
plotCI(x = means.males, uiw = stderr.males, lty = 1, xaxt = "n",
       xlim = c(0.5, 2.5), ylim = c(-1, 1), gap = 0, add = TRUE)

# Draw the x-axis (omitted above)
axis(side = 1, at = 1:2, labels = names(stderr.males), cex.axis = 0.7)

# Add legend for male and female participants
legend(2, 1, legend = c("Male", "Female"), lty = 1:2)
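
As for the offsetting I mentioned above, here is a sketch of a possible workaround (untested against the original data; it reuses the means/stderr vectors from above): instead of letting plotCI() place both groups at x = 1 and x = 2, shift the x positions of the two groups slightly apart:

# Sketch: offset the two groups horizontally so that their error bars
# are drawn next to each other instead of on top of each other
offset <- 0.05
plotCI(x = 1:2 - offset, y = means.females, uiw = stderr.females,
       lty = 2, xaxt = "n", xlim = c(0.5, 2.5), ylim = c(-1, 1), gap = 0,
       ylab = "Microworld Performance (Z Score)", xlab = "Stereotype Threat",
       main = "Microworld performance over experimental conditions")
plotCI(x = 1:2 + offset, y = means.males, uiw = stderr.males,
       lty = 1, xaxt = "n", gap = 0, add = TRUE)
axis(side = 1, at = 1:2, labels = names(stderr.males), cex.axis = 0.7)
legend(2, 1, legend = c("Male", "Female"), lty = 1:2)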


[1] Cumming, G., & Finch, S. (2005). Inference by Eye: Confidence Intervals and How to Read Pictures of Data. American Psychologist, 60(2), 170–180.


A reply from SPSS Inc.

SPSS Inc. replied to my open letter on the poor quality of SPSS 16 for Mac:

Dear Bertolt:

I want to acknowledge your email/blog and apologize for the inconveniences caused by SPSS 16.0 for Mac. Your bugs, issues and suggestions have been logged and we will work on fixing them in future releases.

We are getting ready to start beta testing SPSS 17.0 for all three platforms -- Windows, Mac and Linux -- in a couple months. Would you like to participate? We would love to have your input. Beta testers get a free copy of the final software.

Thanks,
Arik

________________________
Arik J. Pelkey
Sr. Product Manager
SPSS Inc.
Phone: [deleted]
www.spss.com


________________________________

I do acknowledge the friendly mail and the fact that they didn't try to argue away or justify the issues. However, note that they apologize for the inconvenience SPSS caused, but not for the bugs, i.e. for the quality of their software. That may sound like splitting hairs, but to me, it's a difference. Why can companies never just say something like: "We know we screwed up big time. We're sorry."? Why does it always have to be some sort of marketing speak? Anyway, I do appreciate the invitation to their beta program, which I am going to accept (criticism should always be constructive, eh?).

However, in my reply I also suggested two steps on SPSS's part: first, SPSS should publicly acknowledge the issues with SPSS 16 for Mac; second, I urged SPSS to review their internal processes for software testing. More rigorous product testing would have saved both them and me a lot of time and nerves.


SPSS 16 for Mac Doesn’t Make the Cut

Mark Kupferberg took up my open letter to SPSS on his Blog. He agrees with me on the poor impression that SPSS 16 for Mac's UI creates:
I haven’t personally seen SPSS 16 for the Mac, but looking at the pictures Bertolt provided, I can certainly see why one might be concerned. It really does look like something that belongs on Windows 3.1.
Some of the people who commented on my rants defended the UI for two main reasons: first, that it has been this way since the first release and it's good that it stays the same, and second, that it's what Windows users get, too. I think both views are flawed. A good piece of software can improve its UI without turning its users away; changes for the worse have to be avoided, of course, but no changes at all, purely for the sake of stability, is not a convincing argument. And secondly, no, the Mac UI is not what Windows users are served. The icons on Windows look similar, but they are smaller, they integrate better with the overall design, and the entire UI makes a more organized impression on me:


2008-03-28

Learning R for SAS and SPSS Users

For all of those who are as frustrated with SPSS as I am, decisionstats has a great tip:
So you decided to cut down on your Statistical software expenses and decided to get R.

but the problem is you know SAS /SPSS and you need to learn R fast enough to justify switching over …….

the ideal book for you is http://oit.utk.edu/scc/RforSAS&SPSSusers.pdf


2008-03-27

SPSS 16 for Mac: Insulting users. An open letter to SPSS Inc.

Dear Ladies and Gentlemen at SPSS Inc,

As a psychologist working in experimental research, I consider the statistical analysis of data the bread and butter of my daily work. Like the majority of my colleagues in the social sciences, I use the de-facto industry standard for this task: SPSS; the very product your company is built on, the very product that is supposed to deliver a "statistical package for the social sciences" - which is what SPSS originally stood for before it became a brand.

Let me remind you that this is an exclusive piece of software that comes with a steep price tag of $639 for the single base version for higher-education institutions ($1699 for commercial users).

I am writing you this open letter concerning the quality of your most recent version of SPSS for the Mac - the first version that runs on Intel-based Macs, SPSS 16.0 for Mac.

SPSS 16 for Mac - which I have to use on a frequent basis - is the most insulting piece of software I have ever come across. I have been frequently annoyed by software in my lifetime, but this is the first time that I actually feel insulted by a commercial piece of software. Its astonishingly poor interface design and the long list of bugs I discovered during a single week of intense usage make me wonder whether SPSS 16 for Mac was ever used for its intended purpose at your company before you dared to ship it to us - your end users and customers. Do you think that just because we're scientists, you can throw this half-baked crap at us?

The poor impression begins right after double-clicking the icon, when SPSS displays its splash screen:

Non-English characters, as they appear in the name of my organization (Universität Zürich), are not displayed correctly. Your programmers have obviously never heard of proper internationalization.

Secondly, the overall appearance makes me think it's 1996.

Especially the toolbar looks exactly like I would expect a toolbar to look in a 1990s piece of cheap shareware:

I mean, honestly, is this some kind of joke? This interface conveys neither informational value nor scientific professionalism (if that was intended). The only thing it conveys is your utter lack of interface design principles.

But apart from such minor issues (as you seem to think that UI design is a minor issue), the list of bugs I came across during a single week of working with SPSS 16.0 for Mac is mind-blowing.
  • Double-clicking a saved viewer output file in the Finder opens an empty data file instead. Double-clicking the output file in the Finder again leads to an error message telling me that the file is already open (which it isn't).
  • If I go through the cumbersome process of defining input parameters for a data file in text format and save the parameters as a template for future imports, I cannot load the template the next time I want to use it. When I click on the template file in the open-template dialog, nothing happens.
  • If I select "Data... -> Merge Files -> Insert Variables", choose an external file and tell SPSS to add certain variables from that file to my current file while dropping others, the resulting syntax produces an error and nothing happens.
  • Importing variables with values stored in decimal format (e.g. "4.023") from a text file produces missing values, i.e. they are not imported at all, despite the fact that they are displayed correctly in the preview of the import wizard. Changing the variable type from numeric to string doesn't help.
  • The menu bar in the output viewer disappears from time to time. Only quitting and restarting SPSS brings it back.
  • When re-opening a saved viewer file, the font face of all custom-edited headlines is changed from Arial 16 to Times New Roman 12.
  • Overall performance is incredibly slow.
  • In the output-viewer, double-clicking a diagram for editing and closing it again sometimes leads to all changes being lost.
These are just the most prominent bugs I came across. I am sure that there is more where that came from. Do you have any kind of testing whatsoever at SPSS? What kind of impression do you think such experiences create? On my part, it creates the impression that you disrespect your users.

According to Eric Sink, there are three categories of software:
  • MeWare: The developer creates software. The developer uses it. Nobody else does.
  • ThemWare: The developer creates software. Other people use it. The developer does not.
  • UsWare: The developer creates software. Other people use it. The developer uses it too.
For me, SPSS is an extreme example of ThemWare. You seem to have no clue about the poor quality you're creating - at least for the Mac. This impression is all the more stark because I have to use your products alongside beautifully designed pieces of software such as BibDesk, Apple Pages, and Apple Mail.

In my opinion, there is a piece of statistical software that is just the opposite of SPSS: R. It doesn't sport a graphical interface like SPSS does (it's syntax-only, like SPSS used to be), but it's certainly more powerful, creates better graphs, and is built and maintained by a community of people who care for their product and actually use it. I've been using R alongside SPSS for six months now and I haven't come across a single bug. If R had a powerful graphical interface, your product would be off the market within a week.

My experience with SPSS 16 for Mac will make me switch to R once and for all. Furthermore, I will encourage my colleagues to do the same.

Frustrated,
Bertolt Meyer

Note: The link to the three categories of software stems from Jeff Atwood's Coding Horror.

Update: Two more bugs I can reproduce:

  • Copy and Paste from Excel is not working
  • Importing Excel Files produces "?" as values after the 40th variable
Update 2: According to this Sitemeter entry, someone from SPSS has read this post. I wonder whether I will receive a reply.

Update 3: The story has been picked up elsewhere and SPSS replied.


Gender Effects in Complex Problem Solving

I am really excited about my most recent experiment on gender effects in complex problem solving (CPS). Complex problems represent the type of problems that managers and politicians face on an everyday basis: a complex and dynamic system (one that changes on its own over time) needs to be transformed from a current state into an ill-defined goal state; the system is networked (i.e., tweaking one screw will lead to unanticipated changes in other parts of the system), and multiple, possibly conflicting goals need to be pursued. The ability to solve such complex problems is tested with so-called "microworlds", complex computer simulations that place the player in a semantically framed complex problem scenario: a company needs to be saved from bankruptcy, a system needs to be steered within certain parameters, a forest needs to be tended, an eco-system must be maintained. These microworlds run over a simulated period of time (usually several months), many variables can be tweaked, and decisions taken at an early step influence the further course of the game. A bit like SimCity, if you will. CPS performance is largely determined by certain factors of intelligence and by knowledge of the system in question. This knowledge is usually obtained during the problem solving process itself. Thus, the ability to identify connections, to understand systems, and to learn quickly is a key determinant of CPS.

Since complex problem solving is considered to be a core managerial competence, microworlds are frequently employed in assessment centers by large corporations such as banks and business consultants.

However, I came across two studies that bothered to examine microworld performance scores separately for male and female experiment participants. All other studies I came across did not report individual findings for the two sexes. I have an idea why: the two above-mentioned studies reported gender effects in the direction that men outperform women. Those two studies (both from the 90s) explained the gender effect with higher intelligence levels and higher levels of computer experience among male participants. If these artifacts were controlled for, the statistical difference between male and female complex problem solvers would vanish. That was not the case in the experiment I conducted for my PhD thesis: I found severe gender differences in CPS performance, even after controlling for several variables (intelligence, learning, computer experience, and economic knowledge). These variables were unable to explain the gender differences I found.

Now, one wouldn't say that women are poorer managers than men. At the same time, if my results hold true, the use of microworlds in assessment centers favors male applicants over female applicants. This sounded like an important issue to me and I decided to pursue the matter further.

My brilliant colleague Carmen Lebherz suggested the concept of stereotype threat to me when I told her about my odd findings. Wikipedia:
"Stereotype threat is the fear that one's behavior will confirm an existing stereotype of a group with which one identifies. This fear may lead to an impairment of performance."
In my case, the explicit or implicit stereotype that women are poor at CPS (or at "computer-related stuff") may have impaired the performance of my female experiment participants. I designed an experiment in order to test this assumption. We employed a 2 x 2 x 3 between-subjects design: gender (male / female) x stereotype threat (yes / no) x microworld (Taylorshop / FSYS / ColorSim). Stereotype threat was manipulated through the instructions that the experiment participants received. In the stereotype threat condition, participants were told that we would measure their ability to solve complex problems with a complex problem solving microworld. We told them of the role microworlds play in assessment centers and asked them to do their best. In the non-stereotype-threat condition, we told them that we would like them to play a kind of computer game and that we were interested in the emotions that this game would create (which we measured with Marx & Stapel's 2006 questionnaire).

The result: Over all three scenarios employed, female experiment participants exhibited much poorer performance under the stereotype threat condition than under the non-stereotype-threat condition, as the graph below illustrates (standardized CPS performance over all three microworlds is indicated on the y-axis).

The weird thing is that this happens both to women who think that men do better in microworlds and to women who do not, i.e. the effect of stereotype threat occurs regardless of whether the participant endorses the stereotype.

Further analyses of covariance will hopefully shed more light on the conditioning factors of these effects (we measured motivation, frustration, anxiety, intelligence, and experience with computer simulations, to name only a few); a sketch of what such an analysis could look like follows below. In any case, this is a compelling example of the role of the situation and setting in human performance.
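
Just to make the idea concrete, here is a minimal sketch of what such an analysis of covariance could look like in R. The data frame cps below is simulated for illustration only; the variable names are made up and it is not the experimental data:

# Illustrative only: a 2 (gender) x 2 (stereotype threat) ANCOVA with an
# intelligence covariate, run on simulated data.
set.seed(1)
cps <- data.frame(
  performance  = rnorm(160),                                  # z-scored CPS performance
  gender       = factor(rep(c("male", "female"), each = 80)),
  threat       = factor(rep(c("no", "yes"), times = 80)),
  intelligence = rnorm(160)                                   # covariate
)
model <- aov(performance ~ intelligence + gender * threat, data = cps)
summary(model)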

I will try to write up a paper on our findings as soon as I finish the data analysis. In the meantime, I would like to thank my collaborators and the people who enabled this experiment: Heinz Gutscher for the generous funding and the tremendous working conditions in his group, Jürgen Boss for adapting ColorSim for use in my experiment (during his xmas holidays!), Annette Kluge for providing me with Jürgen's tailor-made version of ColorSim, Dietrich Wagener for providing a copy of FSYS, my students Jeanine Grütter, Marisa Oertig, and Rahel Schuler for their great efforts in conducting the experiments (179 participants in the lab in six weeks!), and finally our great and willing participants.

Copyright for the first two above images obtained from www.istockphoto.com. Reproduction is prohibited.


2008-03-13

hi-speed wifi

I always thought that the German ICE was the top of the food chain when it comes to bullet train technology. Dubbed 'Maybach on rails' by Deutsche Bahn, it had me convinced that hi-speed train traveling couldn't get any better: comfortable, great design, and laptop power outlets at every seat.

Well, that's nothing. It's nothing compared to the latest French TGV Lyria that I had the pleasure to ride last weekend: 4 hours and 20 minutes from Zurich Central to Paris Est (at a steep 200 EUR return). The train was well designed, nice colors, but it didn't feel quite as luxurious and solid as the ICE. Until I booted my laptop, detected a Wifi network and found this:
The train apparently has its own web server that serves the site above. In the center is a Flash-based live map that displays the current location of the train. The box on the top left displays the current speed (up to 340 km/h) and the completed percentage of the journey. The best thing, however, is the third menu item on the left: Internet! During approximately 2 hours of the journey (and within France), that link fires up a free internet connection (provided by Orange mobile). I checked my emails and surfed the web at decent speeds. For free.

Fuck you, Deutsche Bahn: the holy grail of coolness now resides with the French, and an overpriced WLAN service on one lousy connection that is only available to first-class passengers can't keep up with that.

Hi-speed trains rock: you travel from center to center, there are no airport annoyances (check-in, security, weight limitations, waiting at gates, endless journeys to shitty airports and god-knows-what), it's much better for the environment, and there's power for laptops. If you ask me, it's the best way to travel - at least in Europe.

Paris was fantastic by the way.
