Wednesday, December 10, 2014

Facebook Opens Up Tools To Scale Memcached

When a back-end database can’t serve up bits of data fast enough to deal with the hit load on the web servers that sit in front of it, the common thing to do is to put some kind of caching software between the database and the web servers. In the open source world and increasingly among enterprises, the software most commonly chosen for this job is Memcached. Social media giant Facebook has come up with its own extensions to Memcached and has done the altruistic thing and opened them up for others to use.
Facebook Engineering, the hardware and software development arm of the company, hosted its fifth @Scale Conference in San Francisco, and the opening up of the mcrouter software that extends Memcached was the big news of the show. Facebook developed mcrouter for its own needs to make Memcached more malleable and scalable as it sits between the company’s MySQL database clusters and its web servers. As the name suggests, Memcached is a program for storing elements of a web page in the main memory of servers that are linked together and kept in sync as the web pages are updated in the production databases that underpin the pages. Hitting the cache servers for stored elements is much quicker than pulling them out of the database, which allows for lower latency and higher throughput on web applications.
In a blog post, Facebook software engineers Anton Likhtarov, Rajesh Nishtala, and Ryan McElroy announced the opening up of the mcrouter code, which is technically a Memcached protocol router. The software is mostly written in C++ with some libraries written in C and protocol parsing written in Ragel; the mcrouter stack also uses the Folly and fbthrift libraries developed by Facebook. The latter is an improved version of the Facebook tool for automatically generating the client and server sides of remote procedure calls, and it was open sourced earlier this year.
The Instagram picture hosting site adopted mcrouter as well in the wake of being acquired by Facebook and deployed it on the Amazon Web Services cloud before Instagram’s infrastructure was brought in-house to run in Facebook’s own datacenters in Oregon, North Carolina, and Sweden. Facebook says that mcrouter handles traffic across thousands of cache servers in dozens of clusters in its datacenters, and during peak loads handles close to 5 billion requests per second. News aggregation site Reddit has also deployed mcrouter in its infrastructure for a limited beta test. By opening up the mcrouter code under a BSD license, Facebook is hoping others can benefit from the work it has done to help Memcached scale better. The company is no doubt also interested in fomenting a community of developers to help make mcrouter better as they put it into production, improvements that Facebook itself will be able to benefit from in the future.
By the way, two and a half years ago, Twitter open sourced its own proxy server for Memcached and Redis NoSQL data stores, called Twemproxy, which allows web front ends and Memcached cache server pools to scale independently; it looks like mcrouter is a more sophisticated bit of code.
Any Memcached server or any other caching program that supports the Memcached protocols can be hooked into the mcrouter software. The software allows for Memcached servers, which are called destinations in mcrouter, to be pooled so multiple clients can come in on a single mcrouter instance and yet share outgoing connections to the Memcached servers. When a working dataset gets bigger than a single Memcached server instance, mcrouter uses a consistent hashing algorithm to shard the data and spread it across multiple Memcached servers. The mcrouter software also knows how to segment different datasets stored in the memory cache and split them into separate pools, and it can handle the updating of data on replicated pools that are created to give extra throughput for specific data. The mcrouter software also knows how to create backups of Memcached datasets and route requests to the backups in the event that a primary cache server in the cluster fails. There is a pool-level failover feature that allows for a local cluster of Memcached machines to fall back to a neighboring pool of cache servers in the event of a network failure, and groups of mcrouters can maintain consistency across Memcached clusters by broadcasting deletes of data globally across the routers and therefore their downstream Memcached servers.
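To make the sharding idea concrete, here is a minimal sketch of consistent hashing in Python. It is not mcrouter's code or its hash function: the `HashRing` class, the MD5-based hash, the virtual-node count, and the server names are all assumptions chosen for illustration. The point it shows is that removing one cache server only remaps the keys that lived on that server.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    # Stable 64-bit hash derived from MD5 (an illustrative choice only).
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Toy consistent-hash ring with virtual nodes (hypothetical helper)."""

    def __init__(self, servers, vnodes=100):
        self.vnodes = vnodes
        self._points = []  # sorted hash positions on the ring
        self._owner = {}   # hash position -> server name
        for server in servers:
            self.add(server)

    def add(self, server):
        # Each server gets several positions so load spreads more evenly.
        for i in range(self.vnodes):
            point = _hash(f"{server}#{i}")
            self._owner[point] = server
            bisect.insort(self._points, point)

    def remove(self, server):
        self._points = [p for p in self._points if self._owner[p] != server]
        self._owner = {p: s for p, s in self._owner.items() if s != server}

    def server_for(self, key):
        # First ring position clockwise from the key's hash, wrapping around.
        idx = bisect.bisect(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


ring = HashRing(["cache-a", "cache-b", "cache-c"])  # hypothetical server names
print(ring.server_for("user:42"))
```

With a plain `hash(key) % number_of_servers` scheme, dropping a server would remap most keys; with the ring, keys owned by the surviving servers stay where they were.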
Given all of this functionality, it would be interesting to see mcrouter, which has a BSD license, rolled into the actual Memcached project, which has a BSD-New license.
In addition to the opening up of the mcrouter software, Facebook Engineering also said that it was teaming up with Box, Dropbox, GitHub, Google, Khan Academy, Stripe, Square, Twitter, and Walmart Labs to create a new open source collaboration organization called TODO, which is short for “talk openly, develop openly.” The partners are being vague for the moment about precisely what TODO is, except to say that they are open source software advocates that “have come together to help solve the problem of utilizing and releasing open source software.” The idea is to have common tooling for creating, sharing, testing, and consuming open source software and to also share best practices relating to open source code.

Thursday, November 27, 2014

10 Skills You Need To Get $100000 Engineering Job At Google - Tech News | Latest Technology News

Google is among the most sought-after employers in the world. Engineers are the rock stars at Google, and they're paid like it.

Interns start at $70,000 to $90,000 salaries, while software engineers pull in $118,000 and senior software engineers make an average of $152,985. But one does not simply walk into the Googleplex.

The company receives upwards of 2.5 million job applications a year, but only hires about 4,000 people.

For would-be Googlers, the Google in Education team has released a list of skills that they want to see in potential engineers.

"Having a solid foundation in computer science is important in being a successful software engineer," the company says. "This guide is a suggested path for university students to develop their technical skills academically and non-academically through self-paced, hands-on learning."

Here are the skills Google wants its tech talent to master, complete with online resources to get you started...

1. Learn To Code
Learn to code in at least one object-oriented programming language, like C++, Java, or Python. Consult MIT or Udacity.

2. Test Your Code
It’s not just important to know how to code. You should also be able to test code, because Google wants you to be able to 'catch bugs, create tests, and break your software.'

3. Have Some Background In Abstract Math
It is important to have some background in abstract math, like logical reasoning and discrete math, which lots of computer science draws on.

4. Get To Know Operating Systems
Get to know operating systems, for they'll be where you do much of your work.

5. Become Familiar With Artificial Intelligence
Become familiar with artificial intelligence. Google loves robots.

6. Understand Algorithms And Data Structures
Google wants you to learn about fundamental data types like stacks, queues and bags as well as grasp sorting algorithms like quicksort, mergesort and heapsort.

7. Learn Cryptography
Learn cryptography. Remember, cybersecurity is crucial.

8. Learn How To Build Compilers
Stanford says that when you do that, 'you will learn how a program written in a high-level language designed for humans is systematically translated into a program written in low-level assembly more suited to machines.'

9. Learn Other Programming Languages
Add JavaScript, CSS, Ruby and HTML to your skillset. W3Schools and Codecademy are there to help.

10. Learn Parallel Programming
Also, learn parallel programming because being able to carry out tons of computations at the same time is powerful.

Sunday, November 23, 2014

15 Things Successful People Do In Their 20s - Tech News | Techgig

Your 20s are a time of major transitions.
The choices you make in this critical decade lay the foundation for your career, relationships, health, and well-being.
While nothing can replace learning through firsthand experience, you can save some stress by listening to those who have already been through it.
We've scoured our archives and the web to find the best advice for how to make your 20s as enjoyable and productive as possible.
Here are 15 things that successful people do in their 20s:

1. They learn to manage their time.

When you're just starting to build your career, it can be difficult to arrange your days for maximum productivity.
As Etienne Garbugli, a Montreal-based entrepreneur and author, explains in his presentation "26 Time Management Hacks I Wish I'd Known At 20," setting deadlines for everything you're working on and avoiding multitasking are two keys to effectively managing your time.

2. They don't prioritize money above all else.

While there are those who spend their 20s drifting without direction, there are others who are so afraid of failure that they take a job solely because it provides a comfortable paycheck. But, says Quora user Rich Tatum, that job you're not interested in quickly becomes a career, and by the time you're 30, it's a lot harder to start pursuing your passion.
The key, says author Cal Newport, is to pursue something that you're passionate about and is valuable to employers.

3. They save.

A Bankrate survey of 1,003 people found that 69% of those ages 18-29 had no retirement savings at all. Twenty-somethings who don't have enough foresight to recognize that one day they're going to retire and need money to live on are missing out on years of money gained through interest.
Entrepreneur Aditya Rathnam says there's no need to start investing too much, since you're just starting your career, but it's essential to take advantage of your company's 401(k) matching program, if one is available, and/or open an IRA account.

4. They develop a debt repayment plan.

Seventy percent of college students graduated with an average of $30,000 in student loan debt last year, but that doesn't mean that debt is somehow a badge of adulthood.
Debt will start to haunt you, says Quora user Thea Pilarczyk. Develop a repayment plan that lets you pay off your loans as quickly as you are able to and is within your means, and use credit cards to build credit, not pay for things you can't afford.

5. They take care of their health.

As each year goes by, it becomes harder to start a sustained exercise regimen, and harder still to recover from a late night of drinking.
While you're still young, says Quora user Mo Seetubtim, develop healthy habits that will set you up for the next phase of life. Enjoy your vices in moderation, eat well, and choose a workout over a happy hour now and then.

6. They're persistent.

If you're an ambitious 20-something who thinks that adulthood means having things figured out, then getting fired from a job, ending a serious relationship, or having your company fail can be devastating. But the truly successful are able to learn from what went wrong and move forward all the wiser.
"Getting fired and waking up the next day as usual made me realize that failure isn't the end of the world. Getting dumped taught me the difference between a good and a bad relationship, something I already knew inside but refused to accept until the bad relationship was over," says Carolyn Cho on Quora.

7. They don't try to please everyone.

Your 20s are a time to start building a network that will establish a foundation for your career. If you know that, it's a good idea to be on friendly terms with your boss, clients, and all of your coworkers. Eventually, however, you're going to meet people you don't like and those who don't like you. That's normal, and not a sign that you should change yourself, as long as everything else is going well.
"Inevitably, someone will always dislike you. I wish I had figured this out a lot earlier and stopped trying so hard and worrying so much about it," says Cho.

8. They're flexible.

While it's good to set career goals that keep you focused and motivated, you should avoid getting caught up in intricate five-year plans, Joe Choi says on Quora.
Author and investor James Altucher says that one of the main problems he's found among people in their 20s is that they get caught up in absolutes. He recommends keeping yourself flexible and open to new experiences. There's a good chance that the ideal life you envisioned for yourself at age 20 doesn't resemble the one that ultimately makes you happy at age 30.

9. They keep learning.

Degrees from elite universities may make you smarter and help your reputation, but they won't count for much if you don't keep learning as you go.
Read as much as you can about your industry, and learn to develop skills that you probably never would study in a classroom like "the abilities to assimilate, communicate, and persuade," Tatum says.

10. They travel as much as possible.

When you're just starting out, you probably don't have much disposable income. But just because you can't take a week-long ski trip in Switzerland doesn't mean you should confine yourself to the space between work and home.
Your twenties, Shikhar Argawal says on Quora, are a time when "you are mature enough to go out on your own and immature enough to learn from others." Break out of your bubble as much as you can afford to, and don't ignore career opportunities far from home if they arise.

11. They maintain important relationships.

"Your college pals that you think will be your best pals for life? Some will still be there at 40, most will be living their lives doing their thing," says Sutherland Cutter on Quora. As everyone is figuring out their lives, you'll realize that relationships take work to maintain.
It's worth staying in touch with former coworkers and buddies, though. The 1973 study "The Strength of Weak Ties" by Mark Granovetter of Johns Hopkins University found that the weak ties you share with acquaintances are most often the connections that get you ahead, since they have access to different networks and ideas from you.

12. They let things go.

Picking fights and holding grudges will make you miserable, Tatum says, whether that's in your personal or professional life.
You'll realize soon enough that your hard work won't always be recognized, either, Rahul Bhatt writes on Quora. But never let that be an excuse to be lazy or bitter.

13. They think about the impact of their decisions.

You should definitely use the time when you're still single and without kids to take bigger risks than you otherwise would, but that isn't a call to live recklessly.
A decision you make in a few seconds off an emotional impulse "can rob you of years of joy and happiness," Tatum writes.

14. They understand that their parents aren't always right.

Quora user Arpan Roy writes that as he looks back on his 20s, he's come to see that even though he loves his parents and appreciates their advice, it wasn't always the best for him.
As you grow older, you'll come to see your parents less as authority figures and more as people just doing the best they can. "After all, your parents are human, and humans are not correct all the time," Roy says.

15. They're honest.

Deceitful manipulation of others and sucking up to superiors can only take you so far; they're not the keys to a lasting, fruitful career.
"The truth has a way of rearing its ugly head, so the sooner you can come to integrity with yourself and the world at large, the sooner you'll be able to get working towards what you really want, who you really want to be," Arjuna Perkins says.

Monday, November 17, 2014

3 Theories About Programmatic Buying

"Data-Driven Thinking" is written by members of the media community and contains fresh ideas on the digital revolution in media.
Today’s column is written by Marcus Pratt, director of insights and technology at Mediasmith.
The question of how to define programmatic buying is nothing new. It’s a term frequently discussed and written about within the industry and here at AdExchanger. But when five executives are asked to define it, they do not naturally come to agreement.
When AdExchanger surveyed executives on this last year, what I found most interesting about the responses is how much they varied. However you define programmatic, there are some underlying assumptions I often hear:
1. Programmatic is done in real time, leverages ad exchanges or is synonymous with RTB, an acronym that can refer to real-time bidding or buying, which can occur without an auction.
2. Programmatic is more efficient because of process automation.
3. Programmatic is better, at least for the buyer.
All of these beliefs suggest a need for a different skill set in media buyers, with a transition away from the tasks associated with actual media planning. But how true are these common associations?
Theory No. 1: Programmatic Is Synonymous With RTB
While RTB efforts tend to be programmatic, the reverse is not always true: There is an increasing amount of programmatic buying taking place without an auction or any real-time decision-making.
Often referred to as programmatic premium or programmatic direct, the process of buying inventory directly from a publisher through software is clearly gaining momentum. Supply-side platforms are empowering this through creation of private exchanges that give buyers direct access to publisher inventory through the platform of their choice, but this process is far from automated. Frequently, this access is granted after a period of negotiation and once both sides agree on a fixed price for the inventory, sometimes with a volume commitment as well.
That sounds a lot like old-fashioned media buying. There are, however, two key benefits to the buyer that are created by running through a programmatic platform: the ability to accept or deny individual impressions, which enables first- or third-party data targeting and global frequency capping, and the ability to track the performance of these efforts in a central platform, applying data to holistic optimization efforts.
Theory No. 2: Programmatic Is More Efficient
The promise is great: Programmatic buying will allow media buyers to automate mundane processes, remove unnecessary steps, and spend more time on strategy, thus becoming more efficient. These much-lamented mundane tasks include faxing insertion orders, collating RFPs (who really faxes and collates these days?), back and forth negotiations and long rep lunches.
In reality, however, programmatic buying does not alleviate this workload. Consider Rocket Fuel, the DSP/ad network that just filed for an IPO after generating $106.6 million in 2012 revenue. All of the media Rocket Fuel buys is programmatic, but media buyers typically engage them with a standard IO. Another large programmatic ad company, Quantcast, recently implemented DocuSign to make paperwork processing more efficient amid a deluge of IOs.
To be fair, those who do their buying in-house can alleviate this need for an IO as they set up their own campaigns through a self-serve platform. Those with a desire to access private inventory through these systems, however, will find they need to maintain a relationship with the media publisher and negotiate on pricing. This can be done through a platform in many cases, but it may be easier to negotiate over the phone or face to face. I recently attended a lunch-and-learn with a major SSP, who indicated that negotiations through their platform often go “much smoother” once both parties have a conversation.
All of this requires media planning skills to evaluate proposals, negotiate buys and deliver the greatest value possible per dollar. The good news: Digital proposals don’t need to be collated.
Theory No. 3: Programmatic Is Better For Buyers
As shown by rapid growth, programmatic buying certainly has benefits. It isn’t, however, the solution to every advertiser’s problem.
Many are working to bring more custom creative and rich media solutions to programmatic buys, but let’s face it: Most of the time we are talking about standard IAB ad units that elicit eye-rolling from bored creative directors.
So if briefs call for custom, unique executions, working directly with publishers may be a better bet. With so much scale efficiency, programmatic is certainly applicable to direct response media briefs, but sometimes even the best DR ad units are not available programmatically. Consider fixed position banners, text ads, newsletters and native advertising offerings – many can be effective at driving an action, and they cannot always be purchased through a DSP. Testing as part of a programmatic media budget can be a great way to gauge a site’s performance without committing significant budget. If a website performs well as part of a programmatic effort, buyers should consider issuing an RFP directly to the site to see if there are additional offerings available only through a direct relationship.
So how should agencies and advertisers plan for programmatic? Is this a specialized skill, or just an extension of digital media planning? Cop-out that it may be, I think the answer is “both.”
Setting successful strategies in programmatic is going to require the expertise of a media planner working alongside a counterpart well versed in the technology that enables programmatic buying. Strategists should spend time with media technology experts and understand what they are doing. These digital experts should understand the marketing strategy well enough to come forward with ideas on how to leverage technology in reaching campaign objectives.
Agencies realize this as more firms bring programmatic in-house or work more closely with their trading desk counterparts. Yet in many cases, there is still a wall -- sometimes literally -- between strategy and programmatic teams. If this is the case in your organization you should consider tearing it down.

Sample Size Calculator by Raosoft, Inc.

If 50% of all the people in a population of 20000 people drink coffee in the morning, and if you were to repeat the survey of 377 people ("Did you drink coffee this morning?") many times, then 95% of the time, your survey would find that between 45% and 55% of the people in your sample answered "Yes".
The remaining 5% of the time, or for 1 in 20 survey questions, you would expect the survey response to be more than the margin of error away from the true answer.
When you survey a sample of the population, you don't know that you've found the correct answer, but you do know that there's a 95% chance that you're within the margin of error of the correct answer.
Try changing your sample size and watch what happens to the alternate scenarios. That tells you what happens if you don't use the recommended sample size, and how M.O.E and confidence level (that 95%) are related.
To learn more if you're a beginner, read Basic Statistics: A Modern Approach and The Cartoon Guide to Statistics. Otherwise, look at the more advanced books.

In terms of the numbers you selected above, the sample size n and margin of error E are given by
$$x = Z(c/100)^2 \, r(100 - r)$$
$$n = \frac{N x}{(N-1)E^2 + x}$$
$$E = \sqrt{\frac{(N - n)\,x}{n(N-1)}}$$
where N is the population size, r is the fraction of responses that you are interested in, and Z(c/100) is the critical value for the confidence level c.

If you'd like to see how we perform the calculation, view the page source. This calculation is based on the Normal distribution, and assumes you have more than about 30 samples.
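The calculation described above can be sketched in a few lines of Python. This is not the calculator's own page source: it assumes the intermediate term is x = Z(c/100)^2 r(100 - r) with r and E expressed in percent, and it uses Python's `statistics.NormalDist` for the critical value. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def critical_value(confidence_pct: float) -> float:
    """Two-sided critical value Z(c/100) of the standard normal distribution."""
    return NormalDist().inv_cdf(0.5 + confidence_pct / 200.0)


def _x(confidence_pct: float, response_pct: float) -> float:
    # Assumed intermediate term: x = Z(c/100)^2 * r * (100 - r), r in percent.
    return critical_value(confidence_pct) ** 2 * response_pct * (100.0 - response_pct)


def sample_size(population, margin_pct, confidence_pct=95.0, response_pct=50.0):
    """n = N x / ((N - 1) E^2 + x), returned unrounded."""
    x = _x(confidence_pct, response_pct)
    return population * x / ((population - 1) * margin_pct ** 2 + x)


def margin_of_error(population, n, confidence_pct=95.0, response_pct=50.0):
    """E = sqrt((N - n) x / (n (N - 1))), in percent."""
    x = _x(confidence_pct, response_pct)
    return math.sqrt((population - n) * x / (n * (population - 1)))


# The coffee example: population 20000, 5% margin of error, 95% confidence.
print(round(sample_size(20000, 5.0)))  # 377
```

The function returns the unrounded value so the caller can decide how to round; for the coffee example it lands just under 377, matching the survey size quoted above.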

About Response distribution: If you ask a random sample of 10 people if they like donuts, and 9 of them say, "Yes", then the prediction that you make about the general population is different than it would be if 5 had said, "Yes", and 5 had said, "No". Setting the response distribution to 50% is the most conservative assumption. So just leave it at 50% unless you know what you're doing. The sample size calculator computes the critical value for the normal distribution. Wikipedia has good articles on statistics.

Are Marketers Actually Measuring Ad Viewability? AdExchanger And Moat Run The Numbers

Media buyers and suppliers are now authorized to transact on viewable impressions, but how many are even using this form of tracking?
To measure the adoption of viewability tracking and other forms of verification among national and global advertisers, AdExchanger recently worked with ad analytics firm Moat.
We supplied Moat with a list of 100 large online advertisers (a modified version of the Fortune 100, weighted to emphasize large digital marketers). Moat then used its crawler-based technology to determine how often those brands' ad tags are served alongside tags from any of four ad verification vendors known for tracking viewable impressions: Integral Ad Science, DoubleVerify, comScore and Google-owned Adometry. (Moat also provides viewability metrics, but its solution was excluded from this analysis.)
Here's what Moat found:
From 2013 to the first half of 2014, when the Media Rating Council lifted its advisory warning against transacting on viewability, Moat observed a doubling of the share of advertisers running ad verification tags on at least half of their ad impressions, from 15% to 30%. Likewise, the share of advertisers running those ad tags on at least a quarter of their impressions also approximately doubled, from 32% in full-year 2013 to 60% in the first half of 2014.
So according to this experiment, a majority of major advertisers are verifying ads on at least some of their impressions. That's not too shabby, from an adoption standpoint.
But it's also not yet widespread. The four vendors profiled are prominent in the ad verification market. Given that fewer than one-third of advertisers on the AdExchanger list were observed to work with one of them on a majority of ad placements, full adoption of viewability measurement may be many months or years away.
Methodology: Moat crawls about 10,000 sites a day, and visits each of those sites a few hundred times a day, to build its index of digital advertising practices in the US and UK. The four providers of viewability metrics tracked for this analysis also provide other services such as fraud measurement, ad verification and brand safety tracking that are not within the scope of this research exercise. Therefore it's possible that of the 60% of advertisers observed to be working with vendors such as DoubleVerify and comScore, not all are measuring viewability exclusively with those firms.
"Most of the tags do more than just viewability but I know from our interactions with marketers, viewability is the first thing everyone talks about," said Moat CEO Jonah Goodhart. "Fraud detection has also become very popular recently," a trend he said he expects to continue.

The Myths And Realities Of Advertising Algorithms

"Data-Driven Thinking" is written by members of the media community and contains fresh ideas on the digital revolution in media.
Today’s column is written by Jay Friedman, chief operating officer at Goodway Group.
I thought the days when companies promised awesome algorithms without offering any details were over. Companies like AppNexus or DataXu are transparent with their algorithms, which are used to optimize digital media. Yet I still hear others pitch “black box” algorithm cure-alls.
While it’d be great to put this debate to bed, it’s not that easy. While researching this piece, for example, I found that for every data scientist who tells you the right way to develop a performance algorithm, three others claim the equation is wrong and propose something different.
To shed light on the algorithm debate, I’ve laid out a few different perspectives below that are meant to be common sense. I’m not a data scientist, but you shouldn’t have to be one to understand what you’re being sold. This is for you, the media buyer, the client-side marketing executive trying to make sense of big data, and the media sales rep who wants the complex broken down into something simpler.
A Primer On Sample Size
For starters, it’s important to understand that any algorithm must have enough data about a given combination of variables to decide its value. For instance, you wouldn't take a poll of just one person and project the national presidential election because the sample size would be too small. In a race with at least two presidential candidates, you’d need a sample size of 1,067 according to this sample size calculator, if you assumed there are 180 million registered voters and wanted a margin of error of 3% and a 95% confidence level. This is sufficient because the top candidate may be favored by 51%, and the other 46%, with some unknowns. With the 3% margin of error, even if this poll were taken hundreds of times, the candidate with 51% would receive between 48% and 54% of votes 95% of the time. Margin of error works on a bell curve like this, which assumes a 95% confidence level:
OK, maybe math wasn’t your favorite subject, but don’t be intimidated. To read the above, just note that the curve “bunches together” more with a lower margin of error. This just means there’s a much better chance your figures are accurate.
But what if there were 500 viable candidates and none were heavily favored? The top candidate gets 0.8%, the lowest 0.02%. With so many bunched together, even the 2% curve above leaves us uncertain how truly separated the candidates are. Therefore, you might need to increase the sample size.
Here's how this translates to a digital display or video-advertising program (mobile is a bit different).
Five Hundred Candidates or 3.6 Billion Value Combinations
In a typical RTB campaign, 50,000 impressions that ran on a random news website generated a 0.1% conversion rate versus a goal of 0.08% conversion rate. This equates to 50 total conversions. However, if you break it down further, you see 48 of the 50 conversions occurred between 7 a.m. and 10 a.m. Within those 48, 35 occurred on a Monday. Within those 35, 27 were on Windows 7 operating-system (OS) machines. You can see how quickly this unfolds, and it could go on further by adding in more variables.
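The drill-down above is simple arithmetic, and a short sketch shows how quickly each added variable narrows the data. The counts come from the paragraph; the `drill_down` helper and the step labels are hypothetical names used only for this illustration.

```python
impressions = 50_000
conversion_rate = 0.001  # 0.1%
conversions = round(impressions * conversion_rate)  # 50 conversions in total

# Each step keeps only the conversions that also match one more variable.
steps = [
    ("all impressions", conversions),
    ("7 a.m. to 10 a.m.", 48),
    ("Monday", 35),
    ("Windows 7", 27),
]


def drill_down(steps):
    """Return (label, conversions, share of all conversions) for each step."""
    total = steps[0][1]
    return [(label, count, count / total) for label, count in steps]


for label, count, share in drill_down(steps):
    print(f"{label:>20}: {count:3d} conversions ({share:.0%} of total)")
```

After three filters, 27 of the 50 conversions (54%) sit in a single combination of hour, day and operating system, which is why the site alone says little about performance.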
The key takeaway here is this random news website isn't necessarily a great site as a lone variable. It's good at certain times, on certain days, with certain other variables applied.
How many variables and outcomes do you need to take into account when examining a buy? The following are conservative values that actually hurt my case.
Variables listed: Days of Week, Hours of Day, Ad Sizes, Data Segments, Creative Treatments. Total Unique Combinations: about 3.6 billion.

That's right: More than 3 billion unique combinations can be taken into account – and this is conservative. It's probably more like 50,000 sites, 20 data segments, and so on, which would make the number much larger.
Being an advertising major and not formally trained in statistics, I consulted two professional statisticians. These gentlemen advised on some of the techniques below to ensure I followed best practices within their industry.
There are two ways to look at this: We can "project forward" for sample size, as in the earlier presidential poll example, or "look backward," since this is a scenario where we already have data, assuming the buy has already run. When projecting forward, due to so many unique combinations, it’s likely the performance of millions of combinations will bunch up within 0.001% of each other. Going back to that sample size calculator, with an Internet population of 214 million, to get down to a 0.001% margin of error, your sample size now needs to be more than 209 million. That’s a lot to “sample” before knowing what performs and what doesn’t. But this really doesn’t feel right. So let’s “look backward” instead.
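Both calculator figures quoted in this piece can be reproduced with a short sketch. It assumes the calculator uses the common worst-case formula (proportion of 0.5, z = 1.96 for 95% confidence) with a finite population correction; that is an assumption about the tool, and the function name is illustrative.

```python
def sample_size(population, margin, z=1.96, proportion=0.5):
    """Respondents needed for a given margin of error in a finite population.

    margin is a fraction, so 3% is 0.03 and 0.001% is 0.00001.
    """
    # Sample size for an effectively infinite population.
    n0 = (z ** 2) * proportion * (1 - proportion) / (margin ** 2)
    # Finite population correction: caps the answer below the population size.
    return n0 / (1 + (n0 - 1) / population)

poll = sample_size(180_000_000, 0.03)        # presidential poll, 3% margin
display = sample_size(214_000_000, 0.00001)  # display buy, 0.001% margin

print(f"poll:    {poll:,.0f}")
print(f"display: {display:,.0f}")
```

With these inputs the poll needs about 1,067 respondents and the 0.001% case needs a little over 209 million, which is nearly the entire population being sampled.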
To look backward, let’s determine how many "observations" or impressions we need per unique combination of values to derive a statistically valid and confident decision. Per a whole lot of amazingly sleep-inducing Internet chat forums on the subject, there are some instances where 10 observations will suffice and other instances where 30 or 40 will be considered reasonable. Even if 10 observations or impressions are enough, are you running 36.2 billion impressions per flight? This certainly won’t work, so maybe it’s time to give up on the notion of understanding every unique, detailed combination.
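The look-backward arithmetic is a product followed by a multiplication, sketched below. The per-variable counts in `hypothetical_counts` are illustrative assumptions only and are not the values behind the 3.6 billion figure; the 3.62 billion used at the end is simply the count implied by 36.2 billion impressions at 10 observations each.

```python
import math

def total_combinations(value_counts):
    """Unique combinations when one value is picked from every variable."""
    return math.prod(value_counts.values())

def impressions_needed(combinations, observations_per_combination):
    """Impressions required to observe every combination a fixed number of times."""
    return combinations * observations_per_combination

# Hypothetical counts for illustration (only 7 days and 24 hours are givens).
hypothetical_counts = {
    "sites": 5_000,
    "days_of_week": 7,
    "hours_of_day": 24,
    "ad_sizes": 5,
    "data_segments": 10,
    "creative_treatments": 4,
}
print(f"{total_combinations(hypothetical_counts):,} combinations")

# The article's implied count at 10, 30 and 40 observations per combination.
for observations in (10, 30, 40):
    needed = impressions_needed(3_620_000_000, observations)
    print(f"{observations} observations each: {needed:,} impressions")
```

Even the modest hypothetical counts multiply out to 168 million combinations, which shows why adding one more variable or a few more values per variable pushes the total into the billions.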
Just Be Better Than a Human with Excel
Yes, the perfect algorithm should theoretically explore every combination within the variables. But the example above proves this too unlikely, and no algorithm is perfect. Conversely, we don’t need an algorithm that only looks at a single variable. A human could do that with the “sort” function in Excel. Going back to our random news website, let’s say the algorithm looks at just two variables at a time, such as site and data segment, browser and hour of day, or site and day of week. We could argue certain variables are more important than others, but we’re talking magic in a box here. Surely it can calculate any two variables at a time.
To do this, we need the total number of individual pairs among the values in these variables. I cut the number of sites down to 1,000 to further prove the point. I’d love to tell you I know the formula to figure this out, but three minutes in Excel multiplying each column out gave me 59,284 unique pairings.
You’ll remember some stats folks suggesting 10 observations or impressions per combination would be enough. Would you optimize anything off of 10 impressions? Even 100? Since we’re trying to be more realistic but still conservative, we’ll use 1,000 impressions per combination of values. Now we’re up to 59,284,000 impressions needed to get good data across all two-value pairs. Use a more realistic threshold like 5,000 per combination, and we’re up to more than 295 million. How many of you are running this type of buy, with one vendor, in one flight?
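For readers who want the pairing count without a spreadsheet, one way to get it is to multiply the value counts of every two variables and add up the results. The sketch below does that; the counts in `hypothetical_counts` are illustrative assumptions, so the pair total it prints will not match the 59,284 figure, which came from the article's own values.

```python
from itertools import combinations

def count_value_pairs(value_counts):
    """Number of two-value pairings across every pair of distinct variables."""
    return sum(a * b for a, b in combinations(value_counts.values(), 2))

# Hypothetical counts for illustration, with sites cut to 1,000.
hypothetical_counts = {
    "sites": 1_000,
    "days_of_week": 7,
    "hours_of_day": 24,
    "ad_sizes": 5,
    "data_segments": 10,
    "creative_treatments": 4,
}
print(f"{count_value_pairs(hypothetical_counts):,} pairings")

# Impressions needed for the article's 59,284 pairings at each threshold.
article_pairs = 59_284
for threshold in (1_000, 5_000):
    print(f"{threshold:,} per pairing: {article_pairs * threshold:,} impressions")
```

The impression totals come out to 59,284,000 and 296,420,000, in line with the figures above.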
Rather than looking at all of the media variables mentioned above, it might be easier to pivot our viewpoint and look at users instead. This would suggest the algorithm is going to optimize against users and not the media variables like site, time of day and so on. To do this, we need to look at frequency. Going back to the notion of “observations,” research shows us 10 is actually an OK number. We’ve looked at thousands of campaigns and seen that a monthly frequency of eight to 12 is needed before we see results diminish in efficiency. OK, time out: It’s common-sense gut-check time.
If you need roughly 10 impressions per user before you know whether or not to optimize that user in or out of the buy, you’ve also served that user enough impressions to make him or her convert if it was going to happen. There's no point in optimizing against that user now; you already know the outcome.
The 100 Millisecond Response
Moving away from statistics, let’s address the myth that “no human can make decisions within the 100 milliseconds we have to make an RTB decision.” That’s correct, but algorithms don’t, either.
The reality is most RTB-ecosystem participants cache their line items and, therefore, their bids, so they can respond within the 100 milliseconds and not be timed out. In order to cache these line items and bids, the algorithm has to work independently of the bidding, establish new line-item values in the system, and then allow the system to cache those. Even though they’re working independently, they theoretically could repopulate and recache thousands of times per second. Are they? Find out for yourself by asking your RTB rep. If proud of the answer, he or she will tell you.
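The split described above, a slow optimizer that sets line-item values and a fast bidder that only reads cached ones, can be sketched as follows. Everything here is hypothetical: the class, the line-item names, the impression counts, and the toy pricing rule are illustrative assumptions, not how any particular RTB vendor works.

```python
class CachedBidder:
    """Fast path: answers bid requests from a precomputed table."""

    def __init__(self):
        self._bids = {}

    def refresh(self, new_bids):
        # Swap in a whole new table so lookups never see a half-updated one.
        self._bids = dict(new_bids)

    def respond(self, line_item):
        # A dictionary lookup; no optimization happens at request time.
        return self._bids.get(line_item)


def recompute_bids(stats, base_bid, goal_rate):
    """Slow path: toy rule that scales the bid by observed rate versus goal."""
    bids = {}
    for line_item, (conversions, impressions) in stats.items():
        rate = conversions / impressions if impressions else 0.0
        bids[line_item] = round(base_bid * rate / goal_rate, 4)
    return bids


# Hypothetical performance data: (conversions, impressions) per line item.
stats = {
    "news_site_morning": (48, 20_000),
    "news_site_evening": (2, 30_000),
}

bidder = CachedBidder()
bidder.refresh(recompute_bids(stats, base_bid=1.00, goal_rate=0.0008))

print(bidder.respond("news_site_morning"))
print(bidder.respond("unknown_line_item"))
```

The design point is that `respond` never waits on `recompute_bids`. How often the table is refreshed is a separate question from how fast a bid is returned, and it is the one worth asking a vendor about.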
Bringing It All Together
At this point you might ask what the point of an algorithm is at all. The point of this piece isn't to pick on any specific algorithm but to give color around the lunacy that says any digital media algorithm can work magic.
If you’re executing a $50,000 buy with a vendor, take some time and do the math before you decide to just leave it to people who say they have an algorithm. A good algorithm should be transparent, and the company’s work and limitations behind it should be as well. Companies should tell you when they can improve performance and when they can’t, when there simply isn’t enough data.
And they should be willing to give you the data if you would like to review it or make decisions yourself. If I looked like George Clooney and wanted to try to get a date, I wouldn't go out on the prowl with a bag over my head. Those who are confident in their product will show it off and answer your questions without a guarded response.

Sunday, November 16, 2014

20 Free Data Visualization Tools - Tech News | Techgig

20 Free Data Visualization Tools

We hope you will find the list useful for your tasks. Enjoy!

1. Zoho Reports

Zoho Reports is an online reporting and business intelligence service that helps you easily analyze your business data, and create insightful reports & dashboards for informed decision-making. It allows you to easily create and share powerful reports in minutes with no IT help.

2. Visualize Free

Visualize Free is a free visual analysis tool based on the advanced commercial dashboard and visualization software developed by InetSoft, an innovator in business intelligence software since 1996. Visualization is the perfect technique for sifting through multi-dimensional data to spot trends and aberrations or slice and dice data with simple point-and-click methods.

3. TimeFlow

TimeFlow Analytical Timeline is a visualization tool for temporal data. This tool is currently in alpha version, so there is a chance of finding glitches. It provides several displays, including timeline, calendar, bar chart, and table views.

4. Gnuplot

Gnuplot is a portable command-line driven graphing utility for Linux, OS/2, MS Windows, OSX, VMS, and many other platforms. The source code is copyrighted but freely distributed (i.e., you don’t have to pay for it). It was originally created to allow scientists and students to visualize mathematical functions and data interactively, but has grown to support many non-interactive uses such as web scripting. It is also used as a plotting engine by third-party applications like Octave. Gnuplot has been supported and under active development since 1986.

5. Dipity

Dipity lets you create a free digital timeline. It allows you to create, share, embed and collaborate on interactive, visually attractive timelines that integrate video, audio, images, text, links, social media, location and timestamps.

6. Gri

Gri is an extensible plotting language for producing scientific graphs, such as x-y plots, contour plots, and image plots.

7. Easel.ly

Easel.ly is a simple web tool that empowers anyone to create and share powerful visuals (infographics, posters)… no design experience needed! We provide the canvas, you provide the creativity.

8. D3.js

D3 is a small, free JavaScript library for manipulating HTML documents based on data. D3 can help you quickly visualize your data as HTML or SVG, handle interactivity, and incorporate smooth transitions and staged animations into your pages.

9. Creately

Creately lets you create beautiful diagrams, flowcharts, UML, UI mockups and more. It offers 50+ types of diagrams with specialised features to help you draw faster and better, and its real-time collaboration and projects help you work with clients and colleagues.

10. Open Graphiti

OpenGraphiti is a free and open source 3D data visualization engine for data scientists to visualize semantic networks and to work with them. It offers an easy-to-use API with several associated libraries to create custom-made datasets. It leverages the power of GPUs to process and explore the data and sits on a homemade 3D engine.

11. Humble Finance

HumbleFinance is an HTML5 data visualization tool that looks and functions similar to the Flash chart in Google Finance. It makes use of the Prototype and Flotr libraries and is not limited to financial data: it can display any two 2-D data sets that share an axis.

12. Fusion Table

Fusion Tables is an experimental data visualization web application from Google for gathering, visualizing, and sharing data tables. You can create all kinds of layouts with Fusion Tables, even custom ones, and you can analyze millions of rows if necessary.

13. Gephi

Gephi is an interactive visualization and exploration platform for all kinds of networks and complex systems, dynamic and hierarchical graphs. Gephi, a graph-based visualiser and data explorer, can not only crunch large data sets and produce beautiful visualizations, but also allows you to clean and sort the data.

14. Open Refine

OpenRefine (formerly Google Refine) is a powerful tool for working with messy data: cleaning it; transforming it from one format into another; extending it with web services; and linking it to databases like Freebase.

15. Raw

Raw is a free and open source web application for visualizing data as flexibly and easily as possible. It defines itself as “the missing link between spreadsheet applications and vector graphics editors”. The application works by loading a dataset by copy-pasting or drag ‘n’ dropping and allows us to customize the view/hierarchy.

16. R Project

R is a free software environment for statistical computing and graphics. It compiles and runs on a wide variety of UNIX platforms, Windows and MacOS.

17. Exhibit

Exhibit lets you easily create web pages with advanced text search and filtering functionalities, with interactive maps, timelines, and other visualizations.

18. JavaScript InfoVis Toolkit

The JavaScript InfoVis Toolkit provides tools for creating Interactive Data Visualizations for the Web. This library has a number of unique styles and swish animation effects, and is free to use.

19. Pizza Pie Charts

Pizza Pie Charts is a responsive Pie chart based on the Snap SVG framework from Adobe. It focuses on easy integration via HTML markup and CSS instead of JavaScript objects, although you can pass JavaScript objects to Pizza as well.

20. Axiis

Axiis is an open source data visualization framework designed for beginner and expert developers alike. Axiis gives developers the ability to expressively define their data visualizations through concise and intuitive markup.