Thursday, 24 June 2010
An exciting day in the life of a scientist
or: How to kill a few hours trying to get publication quality figures out of Matlab

This is really just a boring moan; blogging as therapy.

Well, yesterday I had the excitement of getting proofs for an 'in press' article. Virtually no errors to be corrected, but what was this list of queries from the publisher? Ah, the figures. Resolution too low. Well, that should be easy to fix - I'd do it first thing. Or so I thought.

9.30 a.m. The paper has an unusually large number of figures, eight, some in colour. All were created in Matlab and saved as .tiff format. I was pretty proud of generating the figures in Matlab. Graphics in Matlab is a bit of a nightmare and takes some time to learn, but once learned, you can generally create figures that are more complex than those produced by the other applications that I know. The proofs have come from Developmental Science, but they tell me the figures are too low resolution, even though I'd selected a 'no compression' option when saving them.

Coincidentally, I have another article that is under consideration by Journal of Neuroscience, who also mention their stringent requirements for figure quality, and point me to a website, Cadmus, that will explain what is required and how to do it. Oh good, I think. Someone will help walk me through how to get good quality figures. Ha ha ha.

Cadmus has a list of programs and formats that are supported. Alas, Matlab not among them. But Adobe Illustrator is. We have a copy of that. I used to have it on my machine, but uninstalled it, because I never used it and I got fed up when graphics files defaulted to opening in it, which took ages. Tracked down the CD, reinstalled it (compact version). Right, I think, Matlab will allow me to export a file in .ai format, and then I will be OK. Ha ha ha.

9.45 I start with the simplest figure I have – a simple black and white line drawing with a couple of text labels. I save it with .ai format.
When I click on it, Adobe Illustrator tries to open it, but first tells me it has 'unrecognised fonts' (Arial?) and then says it can't open it. OK, I think. But I can open an .eps or .tiff file in Adobe, and I can also save my Matlab figure in those formats. But once again, I get strange messages about wrong fonts, and for the .eps version, what appears on the screen is unrecognisable from the original.

I look again at what Cadmus says about Adobe Illustrator. Oh dear.

"PLEASE NOTE: When creating graphics in illustration programs such as Adobe Illustrator with the intention of outputting to an imagesetter or platesetter, it is extremely important that the person creating the illustration have a thorough understanding of the details of imaging in a prepress environment. There are an abundance of complex problems that can occur at output if paths are set up improperly, colors are indicated incorrectly, or other elements are constructed improperly. Trapping issues can also present problems if not addressed. The more complicated your illustration becomes, the greater the probability of problems at output, and therefore the need for more expertise and experience in creating the files."

Decide that I had better try another option, since I have never used Adobe Illustrator and would not recognise a platesetter if I stumbled over one.

But, I think, there is a helpful application associated with Cadmus that allows you to check your files. And I can just open my .eps file from there. Having gone through the usual round of registering, thinking of a password, getting email confirmation of the account, etc., I am in to 'Rapid Inspector'. I try opening my .tiff file. FAIL, says Rapid Inspector. Resolution too low. OK, how about the .eps version? Ah, says Rapid Inspector:

"Rapid Inspector found an image with CMYK color. CMYK color is not supported. Acceptable color space include(s): Spotcolor, Lineart, Grayscale, RGB."

But this is a black and white figure!
I spend some time in Matlab trying to sort this one out, but with no success. My own fault, but I can't find the script that I made to generate the figure in the first place, and I will need to redo it with different fonts etc. So waste 10 mins tracking it down and resolving once again always to save my programs in sensible places with sensible names.

I go on to the web to find out how to change the colormap to gray. Re-run program, save the figure, and try it again in Rapid Inspector. It still tells me I have CMYK color. It also complains about my fonts.

"Rapid Inspector detected that some or all fonts are missing from this file. To pass inspection, all fonts must be embedded. The following fonts are not embedded: Helvetica."

That's odd, as I was using Arial, not Helvetica. Try a few more runs of the program with different fonts. It still doesn't like my fonts.

10.45 Time to do a Google search about how to save a Matlab figure with embedded fonts. Well, it is nice to know I am not alone, and that many others have had this problem over the years. Several complain that it is about time Matlab did something about it. One helpful person, Oliver Woodford, has written a routine called export_fig, which is freely available: http://www.mathworks.com/matlabcentral/fileexchange/23629

Excellent. But, he explains, if you want to use it to create the kinds of files I need, you need to download two other applications from other sources. Fortunately, I already have the first, but the second, xpdf, is one of those applications that makes the non-geek's heart sink when you go to the download webpage and find, instead of clear instructions about what to do, a whole list of possibilities. I fear that the one I probably need ends in .tar.gz. I've tangled with these things before but can never quite remember what to do with them.

11.30 After a bit of fiddling about, I save the .tar.gz file, then try to extract the contents.
A few failures as I do something wrong, and then at last I have it. But I am not sure I have it in the right place and no indication is given as to where it should be saved. I've just stuck it in my Matlab program folder.

12.00 OK, I should be all set, so now let's look at the examples of how to use export_fig. Nice helpful man who wrote the script clearly has been through everything I have, and more. He writes:

"Exporting a figure from MATLAB the way you want it (hopefully the way it looks on screen), can be a real headache for the uninitiated, thanks to all the settings that are required, and also due to some eccentricities (a.k.a. features and bugs) of functions such as print. The first goal of export_fig is to make transferring a plot from screen to document, just the way you expect (again, assuming that's as it appears on screen), a doddle."

This is looking more promising.... Print out the instructions – 13 pages of them.

12.30 Took a break to look at some interesting data: what I ought to be doing instead of this rubbish.

14.30 OK, back to export_fig. First attempt failed. Matlab can't find export_fig. I need to put the script somewhere else. OK, eventually sorted that by putting all the export_fig .m files into the Matlab folder in My Documents. All going very well so long as I am exporting to .png format. But I want .eps. When I try that the program complains it needs pdftops. So where the hell is that? I will have a hunt. Found it, but it is a .cc file, which does not seem to be recognised by Matlab. So I have now spent more time on the website looking for a .m file. Doesn't appear to be one.

Gave up and decided to try a .tiff file. Hah! Nasty bossy Rapid Inspector says PASS. Hooray! But it turned out I was reading in a different file, created with the same name in May. Back to my .tiff option in export_fig. This fails the resolution test, even though it is specified as max quality.

15.00 Have a cup of tea. Back to trying the .eps option. Can't work out how to use the xpdf file.
Program stops and asks for pdftops. I have located pdftops.m and pdftops.cc but neither seems what it wants. As far as I can see from looking at the code, it wants an .exe file. The web tells me that a .cc file is a C++ file. In some desperation I tried renaming the .cc file as .exe, but that did not work.

Decide to write to the author of the script, having read all the comments on the program and found that nobody else is having problems. Send the email. It bounces. I had mistakenly included a full stop at the end of the email address. Try to resend to the correct address: email keeps autocompleting to the address with the full stop. After 2 tries, get into 'frequent contacts' in the address book and delete the entry, so can now send the email.

15.35 I need another cup of tea to calm down. So now trying figure 2, a coloured headplot. Already have it as .tiff; it looks very nice. Rapid Inspector tells me FAIL! Resolution is too low. I try saving as .png. Get lovely looking picture. Rapid Inspector won't read it. Try exporting from Microsoft image reader to .tiff and then reading in. Now I get:

"alpha_planes: Rapid Inspector found extra color channels within this image. Extra color channels are also known as Alpha channels. Alpha channels are not supported. Please use an image editor to remove alpha channels from this file.

resolution: Rapid Inspector found a low-resolution (RGB) image (96 DPI). The minimum required resolution for this type of image is 300 DPI."

16.30 The wonderful Oliver Woodford replied and explained patiently how to cope with the pdftops thing. I downloaded. Still did not work. Downloaded again to the location he had said his was in. It works!! And the figures it creates are acceptable to the wretched Rapid Inspector.

Verdict

I'm really grateful for those who have produced free software that helped me deal with this. But I am really annoyed on two counts. First, Matlab is an expensive package.
It does wonderful things and I love it to bits as a programming tool, but its graphics are not easy to use. People have been complaining for at least 2 years about the difficulty of generating high resolution output, yet nothing has been done. It should be high priority for the Matlab developers to fix this so that there is a simple command to generate this kind of output.

Second, the Journal of Neuroscience exemplifies a trend in many journals to make authors do a lot of work that would, in the old days, have been done by copy-editors and other professionals. Scientists are supposed to have skills in graphic design and programming on top of all their other accomplishments. Some journals do still accept figures in a range of formats and look after any conversion from their end. But increasingly, the onus is put on authors. There appears to be no correlation between the wealth of a journal and the amount of help it will give to authors – in fact, if there is a correlation, I suspect it is inverse. Journal of Neuroscience charges hefty fees for just submitting a paper, let alone publishing it, with added costs of $1000 per figure unless first and last authors are members of the Society for Neuroscience – we are not and we have lots of colour figures. So our grant will be spent on shoring up J. Neuroscience rather than employing a vacation student for a few weeks.

I reckon that on a 1-10 scale of geekiness I am a 6-7, and I am struggling. I am a full-time researcher with good support. I am a reasonable programmer. But I've got lots of colleagues who are trying to produce papers who are closer to a 1 or 2 on the geekiness scale, have little or no support, and are trying to fit research around busy teaching commitments. How on earth can they cope with all of this?
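
For anyone in the same position, here is a minimal sketch of the two export routes described in this post; it is an illustration under stated assumptions, not a definitive recipe. It assumes a 300 DPI target (the minimum Rapid Inspector quoted for RGB images), made-up file names, and a toy plot standing in for a real figure. The export_fig call only runs if export_fig is on the Matlab path, and it assumes the '-r' resolution option behaves as export_fig's documentation describes.

```matlab
% Sketch only: the file names and the 300 DPI target are illustrative assumptions.
fig = figure;                            % toy figure standing in for a real one
plot(1:10, (1:10).^2, 'k-');
xlabel('x');
ylabel('y');

% Route 1: built-in print, asking explicitly for 300 DPI.
set(fig, 'PaperPositionMode', 'auto');   % print at the on-screen figure size
print(fig, '-dtiff', '-r300', 'figure1_print.tif');

% Read the resolution back from the file before handing it to a checker.
info = imfinfo('figure1_print.tif');
fprintf('%d x %d pixels, XResolution = %g\n', ...
        info.Width, info.Height, info.XResolution);

% Route 2: export_fig, used only if it has been downloaded and is on the path.
if exist('export_fig', 'file') == 2
    export_fig('figure1_export.tif', '-r300');   % exports the current figure
end
```

Reading the resolution back with imfinfo is a quick way to see what a checker will see, and it would also guard against the mistake above of inspecting an older file that happened to have the same name.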