Multivariate Analysis

Submitted By arjunsuvarna
Multivariate Analysis of Bike Sharing Demand



Srikanth Pisipati


Lavina Choudhary

1. What is a Bike Sharing System?
A bike sharing system is a means of renting bicycles in which membership, rental, and return are automated through a network of kiosk locations throughout a city. A person can rent a bike at one location and return it at a different one.
2. Introduction/Objectives:
Bike sharing systems generate large data sets that can be used to research and predict future demand from attributes such as wind speed, hour, peak time, humidity, temperature, season, holiday, and working day. The data is also worth analyzing because it records the duration of travel, the departure location, and the arrival location of each trip. For this purpose we use historical usage data from the Capital Bikeshare program in Washington, D.C.
3. Data Analysis/Explanation of Data Set:
We use hourly data spanning 2 years and split it into two sets: a training data set of 10000 records and a testing data set of 6000 records.
Training Data set: It comprises the 1st to the 19th day of each month.
Testing Data set: It comprises the 20th day to the end of each month.
We fit a model of the total bike demand for each hour on the training data set and then test its predictions on the testing data set.
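The day-of-month split described above can be sketched as follows. This is only an illustration under stated assumptions: the function names, the 2011 to 2012 span, and the complete hourly calendar are hypothetical, and the sketch assumes training ends on the 19th and testing starts on the 20th.

```python
from datetime import datetime, timedelta


def split_by_day_of_month(timestamps, last_training_day=19):
    """Split hourly timestamps into (training, testing) lists by day of month."""
    training = [ts for ts in timestamps if ts.day <= last_training_day]
    testing = [ts for ts in timestamps if ts.day > last_training_day]
    return training, testing


def hourly_range(start, end):
    """Yield one timestamp per hour from start (inclusive) to end (exclusive)."""
    current = start
    while current < end:
        yield current
        current += timedelta(hours=1)


# Hypothetical two-year span with no missing hours (an assumption);
# a real data set with gaps would give different counts.
timestamps = list(hourly_range(datetime(2011, 1, 1), datetime(2013, 1, 1)))
training, testing = split_by_day_of_month(timestamps)
print(len(training), len(testing))
```

With a gap-free calendar the split yields 10944 training hours and 6600 testing hours, the same order of magnitude as the 10000 and 6000 records quoted above. Splitting on the calendar day rather than at random keeps every test hour later in its month than the training hours of that month.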

4. Attribute Explanation:
Date time: hourly date + timestamp. Continuous Variable.
Season: 1 = spring, 2 = summer, 3 = fall, 4 = winter. Categorical Variable.
Holiday: whether the day is considered a holiday. Categorical Variable.
Working day: whether the day is neither a weekend nor holiday. Categorical Variable.
Weather: Categorical Variable.
1: Clear, Few clouds, Partly cloudy
2: Mist + Cloudy, Mist + Broken clouds, Mist + Few clouds, Mist
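As a rough sketch of how these attributes might be prepared for analysis, the snippet below parses the timestamp and decodes the categorical codes. The column names (datetime, season, holiday, workingday, weather), the timestamp format, and the sample row are assumptions made for illustration and are not taken from the data set itself.

```python
from datetime import datetime

# Season codes as listed in the attribute explanation.
SEASONS = {1: "spring", 2: "summer", 3: "fall", 4: "winter"}


def decode_record(raw):
    """Turn one raw row (all values as strings) into typed, labelled fields."""
    stamp = datetime.strptime(raw["datetime"], "%Y-%m-%d %H:%M:%S")
    return {
        "hour": stamp.hour,                      # hour of day, 0 to 23
        "day": stamp.day,                        # day of month, used for the split
        "season": SEASONS[int(raw["season"])],   # categorical label
        "holiday": raw["holiday"] == "1",        # categorical flag
        "workingday": raw["workingday"] == "1",  # categorical flag
        "weather": int(raw["weather"]),          # left as a code: the list is incomplete
    }


# Hypothetical row for illustration only.
row = {"datetime": "2011-01-01 05:00:00", "season": "1",
       "holiday": "0", "workingday": "0", "weather": "2"}
record = decode_record(row)
print(record)
```

The weather attribute is left as a numeric code because only categories 1 and 2 are listed here. Extracting the hour from the timestamp makes the hourly demand pattern available as a separate predictor.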
