Shawn Blais taught me all I needed to know about mobile optimizations of my AS3 code. His blog only has a dozen or two articles, but they are chock-full of interesting information (and even sales figures!). The most useful stuff for me, at this point, is the graphics pipeline optimizations.
The thing that ties the following three steps together is one unifying theory: use bitmaps for everything. You can start with MovieClips and vector Sprites, and you can even stick with Flash’s DisplayList to keep things organized. But the actual image data? Bitmaps! Always bitmaps.
Step One: Use the GPU rendering mode
When designing a mobile application, you’ll have an “application.xml” (or similarly named) descriptor file that contains all sorts of nice settings. One of those tells the mobile device whether to render using the CPU or the GPU. Most defaults (including the FlashDevelop template file) point you to CPU, and that may be fine for Flash’s standard vector art. Switching to bitmaps and the GPU setting will give us much better performance.
Open up application.xml and make sure this exists:
<initialWindow>
    <renderMode>gpu</renderMode>
</initialWindow>
(source article is same as step 3)
Step Two: Lower the Stage Rendering Quality
AS3’s stage-rendering quality setting determines how vector art (probably your “Sprite” and “MovieClip” classes) is rendered. The thing is, even at the low setting, the stage still respects your bitmaps’ “smoothing” flag and draws them with no discernible difference. No need to spend CPU cycles on something we aren’t using!
stage.quality = StageQuality.LOW;
If you have vector art you are loading and converting during runtime (see step 3), AS3 even lets you change the stage quality on the fly! Just use this:
stage.quality = StageQuality.HIGH;
convertMySprite(); // Or whatever your function is
stage.quality = StageQuality.LOW;
Step Three: Use Bitmaps, and Cache them
This is probably the best performance-enhancing-drug my mobile apps have used so far, but it only gets the big performance gains if you use it in conjunction with Step 1 (GPU render mode).
The basic idea is to take all of your image data and cache the bitmap data only once, dynamically, in a dictionary. In GPU render mode, this stores the data as a texture in GPU memory on the mobile device. As long as all duplicate images are pulled from the original data, no new memory is used and creating new graphics is lightning fast.
This works particularly well for images used frequently: bad guys, bullets, and common tiles. But I use it for everything!
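To make the idea concrete, here’s a minimal sketch of that kind of cache. To be clear: this is not Shawn’s class (or the one I link below); the `BitmapCache` name and `getBitmap` method are just my illustration of the technique.

```actionscript
package
{
    import flash.display.Bitmap;
    import flash.display.BitmapData;
    import flash.display.DisplayObject;
    import flash.utils.Dictionary;

    // Hypothetical sketch: one BitmapData per source class,
    // so every duplicate on screen shares the same GPU texture.
    public class BitmapCache
    {
        private static var _cache:Dictionary = new Dictionary();

        public static function getBitmap(source:Class):Bitmap
        {
            var data:BitmapData = _cache[source];
            if (!data)
            {
                // Rasterize the vector art exactly once.
                var instance:DisplayObject = new source();
                data = new BitmapData(
                    Math.ceil(instance.width),
                    Math.ceil(instance.height),
                    true, 0x0);
                data.draw(instance);
                _cache[source] = data;
            }
            // Every Bitmap wraps the same cached BitmapData,
            // so no new pixel memory is allocated.
            return new Bitmap(data, "auto", true);
        }
    }
}
```

A real version (like the one linked below) handles more input types and offsets for art that doesn’t sit at (0, 0), but the dictionary-of-BitmapData core is the whole trick.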
Shawn’s original article laid out some source code and a longer explanation if you want to get into details and performance charts. His code does all of the conversions automatically for you, and the discussion in the comments of his article improved upon it. I added a few tweaks myself, and it is now the only class I use for any type of image data. Imported .PNG file? Sprite or MovieClip in a .SWC? Class reference to an object you custom made? Doesn’t matter! All automated, all quick, all easy to use. Best of all: the code is really short, simple, and easy to read in about a minute.
It’s a bit too long to paste here, so here’s a link to the class I use right now. Feel free to use it, just let me know if you improve on it :) Copy and paste it to the root of any project and you should be able to start using it right away.
Bonus Step: Convert MovieClips on the fly
I haven’t tried this step out yet, but it’s an extension of the class I offered up in Step 3: automatically convert each frame from a MovieClip to cached bitmap data (and store that stuff in the GPU). If I had animations in my most recent games, I would be all over this too!
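Since I haven’t tried it myself, treat this as an untested sketch of how that frame-caching might look (the `cacheFrames` name is mine, not from any linked class):

```actionscript
import flash.display.BitmapData;
import flash.display.MovieClip;

// Untested sketch: rasterize each frame of a MovieClip once,
// so the animation can be played back from cached BitmapData.
function cacheFrames(clip:MovieClip):Vector.<BitmapData>
{
    var frames:Vector.<BitmapData> = new Vector.<BitmapData>();
    for (var i:int = 1; i <= clip.totalFrames; i++)
    {
        clip.gotoAndStop(i); // Jump to the frame we want to bake.
        var data:BitmapData = new BitmapData(
            Math.ceil(clip.width), Math.ceil(clip.height), true, 0x0);
        data.draw(clip);     // Rasterize this frame once.
        frames.push(data);
    }
    return frames;
}
```

Playback is then just swapping which cached BitmapData a single Bitmap displays each tick; nothing vector is ever rendered at runtime.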
The biggest drawback I’ve found with GPU render mode is that you lose runtime access to all filters and blend modes. If you want to put a nice glow filter on a flame graphic, it’s easy to pre-bake that. But dynamically adding things, like adding a stroke to a system font (as opposed to spritesheet-based font rendering), is pretty hard to work around. Hit me up if you have trouble with these; I’ve found some solutions (but not all).
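For the pre-baking case, the standard BitmapData API works fine; here’s a sketch of baking a glow into a bitmap before it ever reaches the GPU (the `bakeGlow` helper and its parameters are my own illustration):

```actionscript
import flash.display.BitmapData;
import flash.filters.GlowFilter;
import flash.geom.Point;
import flash.geom.Rectangle;

// Illustrative sketch: apply a glow on the CPU, once, at load time.
// The result is a plain BitmapData that GPU render mode is happy with.
function bakeGlow(source:BitmapData, color:uint = 0xFFCC00):BitmapData
{
    var glow:GlowFilter = new GlowFilter(color, 1, 8, 8, 2, 2);
    // generateFilterRect reports how much padding the glow needs.
    var rect:Rectangle = source.generateFilterRect(source.rect, glow);
    var baked:BitmapData = new BitmapData(
        Math.ceil(rect.width), Math.ceil(rect.height), true, 0x0);
    baked.applyFilter(source, source.rect, new Point(-rect.x, -rect.y), glow);
    return baked;
}
```

The padded rectangle matters: a glow bleeds outside the source’s bounds, and without it the baked result gets clipped.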
A secondary detriment is that a complex DisplayList will slow down your app a bit more than usual. Try to avoid nesting that goes too deep. But you were doing that already, right? ;)
A compounded drawback here is that GPU rendering isn’t available on your desktop (without Stage3D routines). That means when you do a test run of your mobile app on your desktop, you will still see your filters and blend modes, which can lead to some confusion, particularly if you’re porting a complex app written for desktop use.
The most complex scenes I’ve created with these methods run beautifully on my two-year-old Nexus One. I ran an hours-long performance test on my iPad, constantly drawing new moving objects to the screen (and never removing them). Though the frame rate (understandably) dropped to about 3 FPS after quite a while, memory usage stayed nominal and the app never came close to crashing.
Most importantly, and this is a pretty big thing: this performance boost is BETTER than what you get from blitting on a mobile device. That means following those 3 basic steps will give you better performance than Flixel and FlashPunk’s default rendering engines.
In many cases, it’s possible for your mobile devices to outperform your desktop thanks to the direct GPU access. Hopefully this will change with Stage3D? Until then, blitting engines (like Flixel and FlashPunk) are probably your best bet for desktop and browser performance.
For all sorts of performance charts and figures, stress tests and numbers, check out Shawn’s posts (linked in the respective steps above). He sees up to a 4000% speed boost.