Random Forests Tree Ensembles: Salford Systems Exclusive Insight

Interview With Random Forests Co-Developer Dr. Adele Cutler, Remembering Dr. Leo Breiman

SAN DIEGO - Salford Systems has maintained long-term relationships with data mining visionaries like Random Forests co-developer Dr. Adele Cutler. On a recent visit to San Diego, she spent time with Salford Systems discussing plans for future Random Forests developments, offering an introductory session on Random Forests, and sharing some personal memories of her time working with Dr. Leo Breiman at the University of California, Berkeley.

Dr. Cutler shared the chance origin of their collaboration on Random Forests in an interview with Salford Systems' marketing staff.

"Leo was my advisor at U.C. Berkeley," recalls Cutler. "I did my Ph.D. at Berkeley with Leo from 1983-88, but we didn't work on decision trees at all in that time. I knew basically what they were but I didn't understand them very well.... Leo and I worked on optimization problems and archetype analysis when I was at Berkeley."

Cutler didn't even begin to work on decision trees, let alone ensemble methods, until the 1990s, after becoming a faculty member at Utah State University. "I worked on mixture models for a while and that was fun while it lasted, but I began to feel like the applications weren't really there," said Cutler. "So, I went to Leo one day and I said 'Look, Leo, I've come to the end of what I want to do with mixture models. Is there anything you can recommend for a direction for me to follow?'" Breiman was enthusiastic about neural networks, and he recommended that Cutler begin attending conferences where they were being explored and discussed.

According to Cutler, "The Random Forests collaboration and my real start to working in [decision] trees came in a cab. It was a stretch limo actually! I was going to a conference and I had a habit of bumping into him at the airport attending all of these conferences on neural nets. We were trying to hail a cab to the conference hotel and they didn't have a cab available so they gave us a stretch limo at the regular rate. He was telling me about some work he was doing, and it was the early Random Forests. And I started telling him about some of the experiments that I'd been doing that were using [decision] trees. So, we took my project at the time, Perfect Random Trees, and his project Random Forests and immediately stopped working on everything else and began collaborating on RF."

Random Forests is widely available now and is documented as an excellent benchmark tool for data scientists and analysts. Much of the insight provided by Random Forests is generated by methods applied after the trees are grown, including new technology for identifying clusters or segments in data as well as new methods for ranking the importance of variables. The method was developed by Leo Breiman of the University of California, Berkeley, and Adele Cutler of Utah State University, and is licensed exclusively to Salford Systems. Ongoing research is being undertaken by Salford Systems in collaboration with Professor Adele Cutler, the surviving co-author of Random Forests. Random Forests is a collection of many CART trees that are not influenced by each other when constructed. The predictions of the individual trees are combined, by majority vote for classification or by averaging for regression, to determine the overall prediction of the forest. The algorithm is best suited for the analysis of complex data structures embedded in small to moderate data sets containing fewer than 10,000 rows but potentially millions of columns.
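For readers who want a concrete picture of "many trees that are not influenced by each other," the following is a minimal illustrative sketch in Python, not Salford Systems' implementation. It makes several simplifying assumptions: each "tree" is a single-split stump rather than a full CART tree, each stump is given one randomly chosen column, and the forest predicts by majority vote. The function names (fit_stump, fit_forest, predict_forest) and the toy data are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def fit_stump(rows, labels, feature):
    """Fit a one-split tree on one column: keep the threshold with the fewest errors."""
    best = None
    for threshold in sorted({r[feature] for r in rows}):
        left = [y for r, y in zip(rows, labels) if r[feature] <= threshold]
        right = [y for r, y in zip(rows, labels) if r[feature] > threshold]
        left_label = Counter(left).most_common(1)[0][0]
        # If nothing falls to the right, reuse the left label.
        right_label = Counter(right).most_common(1)[0][0] if right else left_label
        errors = sum(y != left_label for y in left) + sum(y != right_label for y in right)
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    _, threshold, left_label, right_label = best
    return feature, threshold, left_label, right_label


def predict_stump(stump, row):
    feature, threshold, left_label, right_label = stump
    return left_label if row[feature] <= threshold else right_label


def fit_forest(rows, labels, n_trees=25, seed=0):
    """Grow each stump independently on its own bootstrap sample and random column."""
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        feature = rng.randrange(len(rows[0]))       # random column for this stump
        forest.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx], feature))
    return forest


def predict_forest(forest, row):
    """Combine the trees by majority vote (classification)."""
    votes = Counter(predict_stump(stump, row) for stump in forest)
    return votes.most_common(1)[0][0]


# Toy data (hypothetical): two well-separated classes in two columns.
rows = [(1, 1), (2, 1), (1, 2), (2, 2), (8, 9), (9, 8), (8, 8), (9, 9)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
forest = fit_forest(rows, labels)
print(predict_forest(forest, (1, 1)), predict_forest(forest, (9, 9)))
```

Because no stump sees another stump's sample or result, the trees could be grown in any order or in parallel. The full method grows complete trees and draws a random subset of columns at every split rather than once per tree.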

# # #

About Salford Systems

Founded in 1983, Salford Systems specializes in providing new generation data mining and choice modeling software and consultation services. Applications in both software and consulting span market research segmentation, direct marketing, fraud detection, credit scoring, risk management, bio-medical research and manufacturing quality control. Industries using Salford Systems products and consultation services include telecommunications, transportation, banking, financial services, insurance, health care, manufacturing, retail and catalog sales, and education. Salford Systems software is installed at more than 3,500 sites worldwide, including 300 major universities. Key customers include AT&T Universal Card Services, Pfizer Pharmaceuticals, General Motors, and Sears, Roebuck and Co. For additional information visit http://www.salford-systems.com.

Media Contact
Heather Hinman
Salford Systems
hhinman@salford-systems.com
