Monday, April 14, 2008

Parallel programming? Well, it’s all about CPU affinity or how to set processor affinity in WPF

Parallel computing is very cool technology, that makes you able to leverage tasks between processors in your system. Today it’s already impossible to buy single processor computer – when you’ll buy new PC, you’ll probably get dual core CPU at least. Today we’ll speak about how to manage affinities of tasks between CPUs in your system. For this purpose we’ll create a small game, named CoreWars. So, let’s the show begin.


What we want to do? Simple game, that includes some complicated math, executed large number of times. Here the code of one Dice roll

void SetDice()
            List<Dice> dices = new List<Dice>();
            for (int i = 0; i < MaxTries; i++)
                double x = rnd.NextDouble();
                double y = rnd.NextDouble();
                double g = (Math.Pow(x, 2) + Math.Pow(y, 2) - rnd.Next()) / (Math.Sqrt(2) * y + Math.Sqrt(2) * ((x - Math.Sqrt(2)) - 1));
                dices.Add((Dice)((int)g & 0x3));
            int mP = dices.Count(dice => dice == Dice.Paper);
            int mR = dices.Count(dice => dice == Dice.Rock);
            int mS = dices.Count(dice => dice == Dice.Scissors);

            int m = (mP > mR) ? mP : mR;
            m = (m > mS) ? m : mS;

            CurrentDice = (Dice)((m & 0x3) % 0x3);

Don’t even try to understand what’s going on here. The result is one dice state Paper, Rock or Scissors (the old kinds game). Now we want to run this number of times and affine each thread to one of our system’s core. How to do it?

Simple way – ask what thread are you running, then use ProcessorAffinity property of the ProcessThread to your CPU id.

ProcessThread t = Process.GetCurrentProcess().Threads.OfType<ProcessThread>().Single(pt => pt.Id == AppDomain.GetCurrentThreadId());
t.ProcessorAffinity = (IntPtr)(int)cpuID;

Very good, but what’s this warning about “System.AppDomain.GetCurrentThreadId()' is obsolete: 'AppDomain.GetCurrentThreadId has been deprecated because it does not provide a stable Id when managed threads are running on fibers (aka lightweight threads). To get a stable identifier for a managed thread, use the ManagedThreadId property on Thread. ” ? Why this happens to me? The simple answer is Thread Processor Affinity never was most reliable thing in Windows (as well as ThreadID). But we still want to make it. What to do?

Let’s go a bit unmanaged. We’ll use SetThreadAffinityMask and GetCurrentThread WinAPI methods to achieve what we want to.

        static extern IntPtr GetCurrentThread();
        static extern IntPtr SetThreadAffinityMask(IntPtr hThread, IntPtr dwThreadAffinityMask);

SetThreadAffinityMask(GetCurrentThread(), new IntPtr(1 << (int)cpuID));

SetThreadAffinityMask(GetCurrentThread(), new IntPtr(0));

So far, so good. Now threads. This step is very trivial

ParameterizedThreadStart pts = new ParameterizedThreadStart(dr.ThrowDices);
Thread t = new Thread(pts);
t.IsBackground = true;

We need also context switch (you remember, WPF)

                int w = WaitHandle.WaitAny(wcs);
                Dispatcher.Invoke(DispatcherPriority.Background, (SendOrPostCallback)delegate(object o)
                    tb.Text = string.Format("The winner is CPU #{0}", o);
                }, w);

Now, what to do with binding? To Bind or not to Bind – that is the question! My answer is to bind! Remember, you need also a small piece of CPU time to process UI events and push frames in renderring thread. Let’s roll dices.

SetThreadAffinityMask(GetCurrentThread(), new IntPtr(1 << (int)cpuID));

            App.Current.Dispatcher.Invoke(DispatcherPriority.Background, (SendOrPostCallback)delegate

                if (CollectionChanged != null)
                    CollectionChanged(this, new NotifyCollectionChangedEventArgs(NotifyCollectionChangedAction.Reset, null));
            }, null);

            for (int i = 0; i < MaxIteractions; i++)
                DiceRoller dr = new DiceRoller();
                App.Current.Dispatcher.BeginInvoke(DispatcherPriority.Background, (SendOrPostCallback)delegate

                       if (CollectionChanged != null)
                           CollectionChanged(this, new NotifyCollectionChangedEventArgs(NotifyCollectionChangedAction.Add, dr));
                   }, null);
            SetThreadAffinityMask(GetCurrentThread(), new IntPtr(0));

Also do not forget to implement INotifyCollectionChanged on result collection and INotifyPropertyChanged on dice object.

After all preparations done, we can start to build basic user interface for the presentation layer. Main window

<StackPanel Name="Root">
        <StackPanel Name="LayoutRoot" Orientation="Horizontal"/>
        <Button Content="Start battle" Click="Button_Click"/>
        <TextBlock Name="tb" TextAlignment="Center" FontSize="15" Visibility="Collapsed"/>


<DataTemplate DataType="{x:Type l:DiceRoller}" x:Key="drt">
            <Rectangle Width="5" Height="5">
                            <DataTrigger Binding="{Binding Path=CurrentDice}" Value="Paper">
                                <Setter Property="Rectangle.Fill" Value="Red"/>
                            <DataTrigger Binding="{Binding Path=CurrentDice}" Value="Rock">
                                <Setter Property="Rectangle.Fill" Value="Black"/>
                            <DataTrigger Binding="{Binding Path=CurrentDice}" Value="Scissors">
                                <Setter Property="Rectangle.Fill" Value="Blue"/>
        <ItemsPanelTemplate x:Key="wpt">

We done. the only task to perform is to detect the number of CPUs in the system and put appropriate number of content controls to show the race result. Do not forget binding too.

pc = Environment.ProcessorCount;

            for (int i = 0; i < pc; i++)
                HeaderedContentControl hcc = new HeaderedContentControl();
                hcc.Width = this.Width / pc;
                hcc.HorizontalAlignment = HorizontalAlignment.Stretch;
                hcc.Header = string.Format("CPU {0}",i);

                ItemsControl ic = new ItemsControl();
                ic.ItemsPanel = Resources["wpt"] as ItemsPanelTemplate;
                ic.ItemTemplate = Resources["drt"] as DataTemplate;
                hcc.Content = ic;

                DiceRollers dr = new DiceRollers();
                this.Resources.Add(string.Format("roller{0}", i),dr);              

                Binding b = new Binding();
                b.Source = dr;
                b.IsAsync = true;
                b.BindsDirectlyToSource = true;
                b.Mode = BindingMode.OneTime;
                ic.SetBinding(ItemsControl.ItemsSourceProperty, b);

That’s all. Now we can race our dices and see what CPU works better. For real life scenario, this sample looks absolutely stupid, but just thing about processor affinity in Parallel Extension CTP – why not to make us able to use it? The answer is – K.I.S.S (keep it simple, stupid). And as for me, I do not like to be stupid and I want to manage affinities! I need high performance for my applications and only me can decide to which processor dispatch which task and how.

Have a nice day and be good people.

Source code for this article.

No comments: