Stuff Michael Meeks is doing
    // load and arrange 8x pixel chunks & a shifted version
    vpermd %ymm0,%ymm1,%ymm5
    vmovdqu 0x20(%rdi,%r10,4),%ymm0
    vpand %ymm5,%ymm2,%ymm5
    vpermd %ymm0,%ymm3,%ymm6
    vpand %ymm4,%ymm6,%ymm6
    vpor %ymm5,%ymm6,%ymm5
    // compare and turn that into a bitmask of duplicates
    vpcmpeqd %ymm0,%ymm5,%ymm5
    vmovmskps %ymm5,%eax
    // store that bitmask (win was here)
    mov %al,(%rcx)
    mov %eax,%r11d
    not %r11b
    shl $0x5,%rax
    // pack only new pixels into memory
    vmovdqa (%rax,%r9,1),%ymm5
    vpermd %ymm0,%ymm5,%ymm5
    vmovdqu %ymm5,(%r8)
    movzbl %r11b,%eax
    popcnt %rax,%rax
    lea (%r8,%rax,4),%r8
    add $0x8,%r10
    inc %rcx
    cmp $0xf8,%r10
    jb // loop to top
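Going only by the comments in the listing above, each iteration seems to take eight pixels, compare every pixel with the one before it, store the result as an 8-bit duplicate mask, and write out only the pixels that differ. A rough scalar sketch of that one step follows. It is plain C++ rather than AVX2, the name `packChunk8` and its parameters are hypothetical, and the loop stands in for what the vector code appears to do with a permute and a popcount.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper, not from the original code: handle one 8-pixel chunk.
// `prev` is the pixel just before the chunk (the "shifted version" needs it
// for lane 0). Returns how many non-duplicate pixels were written to `out`.
std::size_t packChunk8(const std::uint32_t* px, std::uint32_t prev,
                       std::uint32_t* out, std::uint8_t& dupMask)
{
    assert(px != nullptr && out != nullptr);

    dupMask = 0;
    std::size_t written = 0;
    for (int lane = 0; lane < 8; ++lane)
    {
        // Pixel immediately preceding this lane.
        const std::uint32_t before = (lane == 0) ? prev : px[lane - 1];

        if (px[lane] == before)
            dupMask |= static_cast<std::uint8_t>(1u << lane); // duplicate: mark it
        else
            out[written++] = px[lane];                        // new pixel: pack it
    }
    return written; // the output pointer advances by this many pixels
}
```

The return value plays the role of the popcount in the listing: the output pointer moves forward only by the number of pixels that were actually new.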
    // simplified inner loop for 64 pixel chunks
    for (; bitToSet; ++x, bitToSet <<= 1)
    {
        if (from[x] == lastPix)
            rleMask |= bitToSet;
        else
        {
            lastPix = from[x];
            scratch[outp++] = lastPix;
        }
    }
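To try the simplified loop on its own, it can be wrapped in a small function and paired with a decoder that rebuilds the 64 pixels from the mask. This is a minimal sketch under assumptions: pixels are taken to be 32-bit values, and `RleChunk`, `compressChunk64` and `expandChunk64` are hypothetical names, not the original code. A `std::vector` stands in for the `scratch` buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical container for one compressed 64-pixel chunk.
struct RleChunk
{
    std::uint64_t rleMask = 0;          // bit i set => pixel i repeats its predecessor
    std::vector<std::uint32_t> pixels;  // only the pixels that start a new run
};

// Same logic as the simplified loop; lastPix is carried across chunks.
RleChunk compressChunk64(const std::uint32_t* from, std::uint32_t& lastPix)
{
    RleChunk chunk;
    std::size_t x = 0;
    // bitToSet becomes zero after 64 shifts, which ends the loop.
    for (std::uint64_t bitToSet = 1; bitToSet; ++x, bitToSet <<= 1)
    {
        if (from[x] == lastPix)
            chunk.rleMask |= bitToSet;
        else
        {
            lastPix = from[x];
            chunk.pixels.push_back(lastPix);
        }
    }
    return chunk;
}

// Inverse: a clear bit pulls the next stored pixel, a set bit repeats the last one.
std::vector<std::uint32_t> expandChunk64(const RleChunk& chunk, std::uint32_t& lastPix)
{
    std::vector<std::uint32_t> out;
    out.reserve(64);
    std::size_t next = 0;
    for (std::uint64_t bit = 1; bit; bit <<= 1)
    {
        if (!(chunk.rleMask & bit))
        {
            assert(next < chunk.pixels.size());
            lastPix = chunk.pixels[next++];
        }
        out.push_back(lastPix);
    }
    return out;
}
```

Both functions must start from the same `lastPix` value for the round trip to work, because a pixel equal to the carried-in value is recorded only as a mask bit.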
My content in this blog and associated images / data under images/ and
data/ directories are (usually)
created by me and (unless obviously labelled otherwise) are licensed under
the public domain, and/or if that doesn't float your boat a CC0
license. I encourage linking back (of course) to help people decide for
themselves, in context, in the battle for ideas, and I love fixes /
improvements / corrections by private mail.
In case it's not painfully obvious: the reflections reflected here are my own; mine, all mine ! and don't reflect the views of Collabora, SUSE, Novell, The Document Foundation, Spaghetti Hurlers (International), or anyone else. It's also important to realise that I'm not in on the Swedish Conspiracy. Occasionally people ask for formal photos for conferences or fun.
Michael Meeks (michael.meeks@collabora.com)