Jani Joki, Futuremark’s Director of PC Products and Services, talks with TPG about all things benchmarking. You will learn how MadOnion became Futuremark, what exactly goes into creating the models and art style of 3DMark, the idea of benchmarking a system before you buy, plus much more.
Please tell us a little bit about yourself and your role with the development of 3DMark.
My name is Jani Joki and I have overall responsibility for the development of benchmarks at Futuremark. For each benchmark project we have a lead programmer, who creates the technology behind each test, and a lead artist, who leads the creation of all the art assets. Futuremark is a pretty small company. Right now, we have around 30 people split evenly between developers, engineers and artists, plus a handful of admin, marketing and sales roles.
How did you get started in the PC gaming industry?
I’ve always been a computer enthusiast. I cut my programming teeth on a C64 and was a big Amiga fan for a long time. At college, I met one of the founders of Remedy, and later worked there in the late 90s. At the time I was (well, still am) a fan of Unix and I worked as the system administrator.
I was drafted into Futuremark at the beginning of 2000 as the IT Manager. Since then I’ve been responsible for a variety of things, mostly concentrating on the end-user functionality of our website, until being made responsible for development at large a few years back.
Can you speak on how MadOnion first began and the maturation process which has resulted in what we see today from Futuremark?
Way back in 1997 or so Remedy created Final Reality, which was a bit of a hybrid of a ‘demo’ and a benchmark. This was at the time when the PC was transitioning from software rendering to hardware acceleration. Final Reality was a huge hit and Futuremark was founded as a spin-off company for the sole purpose of making similar benchmarks. Futuremark grew with the dot-com boom, and then shrank as the bubble burst; but throughout our 15 years we have kept on creating benchmarks. The Futuremark credo from the start was to create benchmarks that were easy to use, impressive to look at and which showcased the latest technology.
In the early days the differences between video cards were huge. There were lots of different vendors and gamers didn’t know which features would give the best gaming experience. This gave 3DMark an excellent start.
Though we started as Futuremark, we changed the name to MadOnion.com during the dot-com boom, as it was ‘the thing to do’. I don’t remember any specific meaning in the name, other than an attempt to be memorable, which it most certainly was. As the dot-com era faded we switched back to Futuremark as we increased our focus on business customers. As memorable as it was, MadOnion wasn’t a name that rang true in the board rooms of large corporations. When we started we were (mostly) a company of twenty-somethings – young and energetic, very tech-savvy but lacking experience in project management and product development. A lot of our success was down to individual talent and the drive to excel – qualities born out of Finland’s strong demo scene.
While modern agile project management methods are not that dissimilar to the organized chaos of the early years, there’s definitely more organization and less chaos these days. Back then, you could go to the office at 4am on a Sunday morning and there’d be someone coding away. These days that only happens near benchmark launches.
What are some of the successes and failures you learned from in developing 3DMark?
Our first benchmarks were created ‘blind’, meaning our developers just put in the features and effects that they felt were correct. At that time we had few industry contacts and as we were doing things that hadn’t been done before, there was not much public information to rely on either. Our test lab for the first benchmark was perhaps five different machines, with a couple of graphics cards for each. Nowadays, we have an initiative called the Benchmark Development Program (BDP) that includes AMD, Intel, Microsoft, NVIDIA and many other world-class technology companies. The BDP helps us create relevant and impartial benchmarks by working in close co-operation with program members.
BDP members are involved in the planning and development of each new 3DMark from the initial technical specification to the final public release. In return, we get access to their hardware expertise and an insight into their visions for the direction they expect their technology to take in the coming years. It’s a very collaborative process and enables us to make the best possible benchmarks.
Having the best benchmarks is one thing, but in recent years we have learned that 3DMark needs to produce more than a number. We now put a lot of effort into creating online services to help people understand and compare their scores. For instance, we use our data to create lists of the best graphics cards, processors, motherboards, SSDs and mobile devices. If you are planning on building a new PC, we have tools that will estimate its 3DMark score before you buy the parts.
In its current form, how close is this version of 3DMark to your initial vision?
3DMark is a good match to our vision for the technical aspects. The latest version is our first cross-platform 3DMark and ideally we would have released it on Windows, Windows RT, Android and iOS simultaneously. The reality is that each new platform brings its own challenges and delays, and as we are a small team, we had to be realistic and launch on one platform at a time.
That said, we are really only at the beginning of our vision for 3DMark. Part of that vision is apparent from the name for the latest version. It’s just “3DMark”, not 3DMark 2013, or 3DMark 12 or some other variation.
The reason is that instead of releasing a new version of 3DMark every couple of years as we have in the past, we plan to add new tests to this version over time. You can think of 3DMark as the container for specific benchmark tests. So far, we have Ice Storm, Cloud Gate, Fire Strike and Fire Strike Extreme. We’ll soon be adding Ice Storm Extreme to the Windows Edition (it’s already in the Android Edition), and we will be adding even more tests in the future. We’ll be using the same approach for Android, iOS and elsewhere – updating the apps with new tests, instead of releasing new apps.
Please talk about developing the art style, 3D models, levels and various creative assets for 3DMark.
We know that many people like to run 3DMark just to see the latest graphical effects in action, so the art in 3DMark is really important to us.
Creating the theme for a new test starts with the technology. What new effects are possible in the latest API and what kind of setting would be best for showcasing them? For example, DirectX 11 introduced hardware tessellation. This tech can be demonstrated by smoothing out the edges on rounded objects, by adding texture to flat surfaces, or by increasing the level of detail as you get closer to an object. Each of these uses can suggest an interesting setting for the artists to work with.
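As a rough illustration of the third use mentioned above, raising detail as the camera approaches an object, a distance-based tessellation factor might be chosen along these lines. This is a minimal sketch, not 3DMark's code: the function name `tess_factor` and the `full_detail_distance` parameter are hypothetical, and in a real renderer this logic would normally run in a hull shader. The only outside fact it uses is that Direct3D 11 caps tessellation factors at 64.

```python
MAX_TESS_FACTOR = 64.0  # upper limit for tessellation factors in Direct3D 11


def tess_factor(distance, full_detail_distance=2.0):
    """Return a tessellation factor that falls off with camera distance.

    `full_detail_distance` is a hypothetical tuning value: anything at or
    inside this distance gets the maximum subdivision.
    """
    if distance <= full_detail_distance:
        return MAX_TESS_FACTOR
    # The factor is inversely proportional to distance, so each doubling of
    # distance halves the subdivision and on-screen triangle size stays
    # roughly constant.
    factor = MAX_TESS_FACTOR * full_detail_distance / distance
    # A factor of 1 means the patch is not subdivided at all.
    return max(1.0, factor)
```

With these assumed defaults, an object 4 units away gets half the subdivision of one 2 units away, and very distant objects are left untessellated.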
3DMark Fire Strike, which is our latest high-end benchmark test, uses a multi-threaded DirectX 11 engine that supports tessellation, ambient occlusion, volumetric illumination, particle illumination and a variety of lighting and shadowing techniques. With post-processing, the engine can produce depth of field, bokeh, particle-based distortion such as heat haze, lens effects, bloom and anti-aliasing. One of the coolest new tech features in 3DMark Fire Strike is the smoke simulation, which is calculated on the GPU using Compute Shaders. The simulation uses grid-based fluid dynamics to allow smoke and particles to react realistically to other physical objects in the scene. As the characters fight, the smoke swirls into vortices through physical modeling rather than canned animation.
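To give a feel for what "grid-based" means here, the sketch below shows one building block that grid-based fluid solvers commonly use: semi-Lagrangian advection, where each grid cell pulls its smoke density from wherever the flow carried it from. It is an illustration only, written for the CPU in plain Python, and not Fire Strike's compute shader code. The function name `advect` and the list-of-lists grid layout are assumptions made for the example.

```python
def advect(density, vel_x, vel_y, dt):
    """Move a 2D density field along a velocity field for one time step.

    All three grids are lists of rows with identical dimensions; velocities
    are measured in cells per unit time.
    """
    h, w = len(density), len(density[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Trace backwards along the velocity to find where this cell's
            # contents came from, clamping to the edges of the grid.
            src_x = min(max(x - dt * vel_x[y][x], 0.0), w - 1.0)
            src_y = min(max(y - dt * vel_y[y][x], 0.0), h - 1.0)
            x0, y0 = int(src_x), int(src_y)
            x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
            fx, fy = src_x - x0, src_y - y0
            # Bilinear interpolation between the four surrounding cells.
            top = density[y0][x0] * (1 - fx) + density[y0][x1] * fx
            bottom = density[y1][x0] * (1 - fx) + density[y1][x1] * fx
            out[y][x] = top * (1 - fy) + bottom * fy
    return out
```

A full solver would also update the velocity field itself (for example with a pressure projection step and forces from moving objects), which is where the swirling behaviour comes from. On a GPU, each cell's update is independent of the others within a step, which is why this kind of simulation maps well to compute shaders.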
We have five artists in our team and they create all the assets using common 3D modeling and texture painting tools. They work closely with the programmers to ensure that the engine includes the tools they need to bring the scene to life.
Tell us about your relationship with Steam and the process of submitting Futuremark products.
We love working with Steam. Our contacts at Valve have always been very helpful and supportive. We started talking to Valve about bringing 3DMark to Steam a couple of years ago. What we didn’t know at the time was that they already had plans to create a whole category for software on Steam. When they were ready the conversation started again and 3DMark 11 and 3DMark Vantage were chosen as launch titles for Steam’s software category. It has been really great to bring 3DMark to the millions of Steam members. The latest version of 3DMark even has Steam achievements.
How did you arrive at the price points for 3DMark?
There will always be a free version of 3DMark that anyone can run to test their system and get a score. (We tried launching 3DMark Vantage without a free edition a few years ago and our community made it very clear what they thought of that!) On Windows, the free version is called 3DMark Basic Edition. On Android, 3DMark is free from the Google Play store, and the iOS and Windows RT editions will also be free.
3DMark Advanced Edition includes additional features and settings for gamers and overclockers. It sells for $24.99, which is a little more than previous versions but is also our first price increase in over 10 years. We also offer 3DMark Professional Edition, which is designed to meet the needs of the press and our business customers.
How important is it to get instant feedback about 3DMark from users through online message boards and other social networking sites?
Feedback from users is really useful, even if it can be overwhelming at times. Benchmarks are complicated things, designed to push hardware very hard. With the huge variety of system configurations out there, testing is a real challenge and it is almost inevitable that unexpected problems will come up from time to time.
We try to fix any issues quickly. Our QA guy, who also handles customer support, has expert knowledge of PC hardware and is usually able to help people solve their problems, even if 9 times out of 10 the cause lies with the user’s PC rather than our software.
How much value do you place on the opinions of those who review 3DMark professionally?
Members of the press are really important to us since their opinions and reviews can be very influential. For example, a key factor in our decision to make the new 3DMark cross-platform was hearing from editors who were desperate for a decent benchmark that would allow them to compare Windows, Android and iOS devices directly.
We provide testing advice and technical support to press and try to make the reviewer’s job easier by providing great benchmarks that give reliable and accurate results. 3DMark is used by hundreds of magazines and websites and we are always very proud when our benchmarks are used to review the latest hardware and devices.
How do you respond to those who feel synthetic benchmarks are meaningless in properly setting up expectations for the ability of games to run on various systems?
Firstly, I disagree with the suggestion that 3DMark is a synthetic benchmark. I would use that term to describe benchmarks that don’t reflect a real-world use case.
3DMark is designed very carefully, and with a great deal of input from companies like AMD, Intel and NVIDIA, to reflect the processing demands made by modern games. The fact that 3DMark is not an actual game is irrelevant to its purpose of measuring the performance of hardware under game-like loads. If there is one big difference between games and 3DMark, it is that by design, 3DMark is much more demanding since this is the best way to demonstrate differences in performance and also ensures that the benchmark will continue to produce relevant results over time as hardware improves.
With very few exceptions, games are far too lightweight to use as meaningful performance tests. If a game runs well on today’s systems, in six months’ time the latest hardware will walk all over it. What’s more, since games are designed to be widely compatible, they rarely use all the features available in DirectX 11. It’s also true that using a game as a benchmark will not tell you how well other games will run, since every game is different. It quickly becomes cumbersome trying to test lots of different games and working out some sort of average performance indicator from all the different results. Nor is it easy to compare those results across hardware, as testing with a game is unlikely to show you whether the CPU or GPU is acting as the bottleneck in the system.
With 3DMark you can see how your system performs under consistent and easily repeatable test conditions. It is easy to compare 3DMark scores from different hardware knowing that the tests are the same. 3DMark also lets you isolate the performance of the GPU and the CPU so that you can see which one is the weaker part of your system.
I hear people say ‘just use games instead’ as an argument against 3DMark. And naturally, if you are only interested in a couple of games, then those games will be the better test, at least if they offer some way of getting repeatable results across multiple runs. Running 3DMark is like testing 50 different games and averaging the results, only much faster, easier and cheaper. 3DMark will tell you how a system handles games in general – and even for a single game, it will give you a good idea of what to expect.
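For readers who want to check the repeatability point for themselves, one simple approach is to run the same test several times and look at how much the results spread relative to their average. The sketch below is an assumption-laden illustration rather than anything built into 3DMark: the function names are hypothetical, and the 3% tolerance is an arbitrary example threshold, not a Futuremark recommendation.

```python
import statistics


def run_to_run_variation(scores):
    """Return the sample standard deviation as a fraction of the mean.

    `scores` holds results (benchmark scores or average frame rates) from
    repeated runs of the same test on the same system; at least two runs
    are needed.
    """
    return statistics.stdev(scores) / statistics.mean(scores)


def is_repeatable(scores, tolerance=0.03):
    """Treat a test as repeatable if its runs vary by no more than `tolerance`.

    The 3% default is a hypothetical cut-off chosen for illustration.
    """
    return run_to_run_variation(scores) <= tolerance
```

A game whose frame rates swing widely between runs would fail this kind of check, and differences between two pieces of hardware smaller than that swing could not be trusted.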
The 3DMark scoring record was recently broken by almost 250 3DMark points. How does it make you feel to see 3DMark being embraced by the overclocking community?
The competitive overclocking scene is incredible. We are constantly amazed by the achievements of professional overclockers. It is exciting watching the scores come in whenever there is a new high-end hardware launch, wondering who will take the number one spot in our 3DMark Overclocking Hall of Fame.
We recently released 3DMark on Android and we’ve already found several forums where people are competing to get the highest scores. People are putting their phones in the freezer to keep the processor cool, using airplane mode to disable the radios and running tweaked ROMs to coax out more score. This week we visited HKEPC in Hong Kong who helped us overclock our HTC One while cooling it with liquid nitrogen. Not very practical, admittedly, but really fun.