If you follow me on Twitter, you know that over the last few days Mario, Nicoptere and I have been trying to find the fastest Median Filtering solution for Flash. Our first attempts took around 7 seconds for a 350×360 image with a small 7×7 kernel! You may think that this is very slow, but Median Filtering is slow in every environment because of the way it works. Anyway, there is a constant-time solution described here whose processing time stays the same for any radius. It is also said to be the fastest known solution for now!
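To give a rough idea of why the processing time can stop depending on the radius, here is a minimal Python sketch (not the ActionScript code from this post). It assumes the constant-time method is the column-histogram approach: keep one 256-bin histogram per image column, and slide the kernel histogram sideways by adding one column histogram and subtracting another. The names `median_from_histogram` and `median_filter`, the grayscale list-of-lists image and the replicated edges are all assumptions made for illustration.

```python
def median_from_histogram(hist, count):
    """Return the median of `count` samples stored as a histogram."""
    target = count // 2          # index of the median in sorted order
    running = 0
    for value, n in enumerate(hist):
        running += n
        if running > target:
            return value
    raise ValueError("histogram holds fewer samples than expected")


def median_filter(image, radius):
    """Median-filter a grayscale image (rows of 0..255 ints)."""
    height, width = len(image), len(image[0])
    size = 2 * radius + 1

    def pixel(y, x):
        # Assumption: pixels outside the image repeat the nearest edge pixel.
        y = min(max(y, 0), height - 1)
        x = min(max(x, 0), width - 1)
        return image[y][x]

    # One histogram per column, including `radius` padded columns per side.
    # They start one row "too high" so the first step lands on row 0.
    columns = [[0] * 256 for _ in range(width + 2 * radius)]
    for c in range(width + 2 * radius):
        for y in range(-radius - 1, radius):
            columns[c][pixel(y, c - radius)] += 1

    out = [[0] * width for _ in range(height)]
    for y in range(height):
        # Move every column histogram down one row: one pixel out, one in.
        for c in range(width + 2 * radius):
            x = c - radius
            columns[c][pixel(y - radius - 1, x)] -= 1
            columns[c][pixel(y + radius, x)] += 1

        # Build the kernel histogram for the first pixel of the row.
        kernel = [0] * 256
        for c in range(size):
            for v in range(256):
                kernel[v] += columns[c][v]
        out[y][0] = median_from_histogram(kernel, size * size)

        # Slide right: add the entering column, subtract the leaving one.
        # This costs 256 operations per pixel whatever the radius is.
        for x in range(1, width):
            leaving = columns[x - 1]
            entering = columns[x + 2 * radius]
            for v in range(256):
                kernel[v] += entering[v] - leaving[v]
            out[y][x] = median_from_histogram(kernel, size * size)
    return out
```

In this sketch each output pixel costs one histogram addition, one subtraction and one scan of 256 bins, so a 101×101 kernel does about the same work per pixel as a 7×7 one. A tuned implementation would add further tricks on top of this.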
So it’s no surprise that all of us tried to implement it. After some unlucky results Nicoptere decided to leave the scene, but I think that was only for lack of time! Mario, as he always does, started implementing Linked Lists to make the process as fast as possible. I have to say that Mario’s version is the fastest for now, with one exception that I describe later.
As for me, I started with simple Vector arrays, and you will find this code in my sources. This approach wasn’t very fast, but it was stable across different kernel sizes. The final version was built using Linked Lists, as Mario recommended. As it turns out, my version runs a little slower than Mario’s at small radii but performs faster as the filter radius grows (starting at radius 7).
I’ve merged my Histogram class with Mario’s filter-applying structure (it is faster and more accurate), which gives faster and much more stable filter results. Now the processing time doesn’t grow with the radius; it even performs faster with a larger radius!
I’ve added a ByteArray-based method using Joa’s TDSI tool. It performs faster at small radii, but as the radius grows the Linked List approach may become faster.
I’ve also added another ByteArray-based method that works amazingly fast! Unfortunately, in the worst cases this method gives unexpected results starting at radius 16, due to a tricky operation that packs several integers into a single entry. I put it in the MedianFilter class under the name constantTimeBALimit, but it is not included in the demo application.
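To show how packing can break at exactly that radius, here is a small Python sketch. It is only a guess at the kind of layout involved, not the actual constantTimeBALimit code. It assumes three histogram counters share one integer, 10 bits each. A 10-bit counter holds at most 1023, and a radius 15 kernel has 31×31 = 961 pixels while a radius 16 kernel has 33×33 = 1089. The names `FIELD_BITS`, `increment` and `unpack` are hypothetical.

```python
FIELD_BITS = 10                       # assumed width of each packed counter
FIELD_MASK = (1 << FIELD_BITS) - 1    # 1023, the largest count one field holds


def increment(packed, channel):
    """Add 1 to the counter of `channel` (0, 1 or 2) inside `packed`."""
    return packed + (1 << (channel * FIELD_BITS))


def unpack(packed):
    """Split the packed integer back into its three counters."""
    return tuple((packed >> (i * FIELD_BITS)) & FIELD_MASK for i in range(3))


def kernel_pixels(radius):
    """Number of pixels covered by a square kernel of the given radius."""
    return (2 * radius + 1) ** 2


def worst_case(radius):
    """Counters after every kernel pixel lands in the same bin of channel 0."""
    packed = 0
    for _ in range(kernel_pixels(radius)):
        packed = increment(packed, 0)
    return unpack(packed)


print(worst_case(15))   # 961 pixels still fit in 10 bits
print(worst_case(16))   # 1089 pixels overflow into the next counter
```

Under these assumptions radius 15 gives `(961, 0, 0)`, but radius 16 gives `(65, 1, 0)`: the first counter wraps around and the carry corrupts its neighbour. That only happens when almost the whole kernel has the same value in one channel, which would fit the "worst cases" wording.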
Pure coded Linked Lists: 7×7 kernel: 0.8-1.17 sec; 101×101 kernel: 0.7-1.14 sec
ByteArray with TDSI: 7×7 kernel: 0.7-0.86 sec; 101×101 kernel: 1.0-1.17 sec
ByteArray with TDSI (Limited): 7×7 kernel: 0.43-0.51 sec; 101×101 kernel: 0.6-0.74 sec
(all tests were done using the standalone non-debug version of Flash Player, on Mac and PC)