Title: Project Success Prediction in Crowdfunding Environments
Crowdfunding has gained widespread attention in recent years. Despite the huge success of crowdfunding platforms, the percentage of projects that succeed in achieving their desired goal amount is only around 40%. Moreover, many of these crowdfunding platforms follow an "all-or-nothing" policy, which means the pledged amount is collected only if the goal is reached within a certain predefined time duration. Hence, estimating the probability of success for a project is one of the most important research challenges in the crowdfunding domain. To predict project success, there is a need for new prediction models that can potentially combine the power of both classification (which incorporates both successful and failed projects) and regression (for estimating the time for success). In this paper, we formulate the project success prediction as a survival analysis problem and apply the censored regression approach where one can perform regression in the presence of partial information. We rigorously study the project success time distribution of crowdfunding data and show that the logistic and log-logistic distributions are a natural choice for learning from such data. We investigate various censored regression models using comprehensive data of 18K Kickstarter (a popular crowdfunding platform) projects and 116K corresponding tweets collected from Twitter. We show that the models that take complete advantage of both the successful and failed projects during the training phase perform significantly better at predicting the success of future projects compared to the ones that only use the successful projects. We provide a rigorous evaluation on many sets of relevant features and show that adding a few temporal features that are obtained at the project's early stages can dramatically improve the performance.
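As an illustration of the censored regression idea described in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes a log-logistic success-time model with a location `mu` and a scale `scale` (in a regression setting, `mu` would typically be a linear function of project features), treats successful projects as observed events, and treats failed projects as right-censored at their deadline. All function and parameter names are hypothetical.

```python
import math


def softplus(z):
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def loglogistic_censored_nll(times, succeeded, mu, scale):
    """Negative log-likelihood of a log-logistic model with right censoring.

    times:     time to success (if succeeded) or project duration (if failed)
    succeeded: True for an observed success, False for a censored project
    mu, scale: assumed location and scale of log(time)
    """
    nll = 0.0
    for t, observed in zip(times, succeeded):
        z = (math.log(t) - mu) / scale
        if observed:
            # Log-density of the success time at t.
            nll -= z - math.log(scale) - math.log(t) - 2.0 * softplus(z)
        else:
            # Log-survival: the project had not succeeded by time t.
            nll -= -softplus(z)
    return nll


def success_probability(deadline, mu, scale):
    """P(success time <= deadline) under the same log-logistic model."""
    z = (math.log(deadline) - mu) / scale
    return 1.0 / (1.0 + math.exp(-z))
```

In this sketch a failed project still contributes to the likelihood through its survival term, which is how a censored model can learn from both successful and failed projects rather than from successes alone.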
Award ID(s):
1527827
PAR ID:
10021821
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Proceedings of the Ninth ACM International Conference on Web Search and Data Mining
Page Range / eLocation ID:
247 to 256
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The often fierce competition on crowdfunding markets can significantly affect project success. While various factors have been considered in predicting the success of crowdfunding projects, to the best of the authors' knowledge, the phenomenon of competition has not been investigated. In this paper, we study the competition on crowdfunding markets through data analysis, and propose a probabilistic generative model, the Dynamic Market Competition (DMC) model, to capture the competitiveness of projects in crowdfunding. Through an empirical evaluation using the pledging history of past crowdfunding projects, our approach has been shown to capture the competitiveness of projects very well, and significantly outperforms several baseline approaches in predicting the daily collected funds of crowdfunding projects, reducing errors by 31.73% to 45.14%. In addition, our analyses of the correlations between project competitiveness, project design factors, and project success indicate that highly competitive projects, while being winners under various settings of project design factors, are particularly impressive with high pledging goals and high price rewards, compared to medium and low competitive projects. Finally, the competitiveness of projects learned by DMC is shown to be very useful in applications of predicting final success and days taken to hit the pledging goal, reaching 85% accuracy and an error of less than 7 days, respectively, with limited information at the early pledging stage.
  2. Offering products in the form of menu bundles is a common practice in marketing to attract customers and maximize revenues. In crowdfunding platforms such as Kickstarter, rewards also play an important part in influencing project success. Designing rewards consisting of the appropriate items is a challenging yet crucial task for the project creators. However, prior research has not considered the strategies project creators take to offer and bundle the rewards, making it hard to study the impact of reward designs on project success. In this paper, we raise a novel research question: understanding project creators’ decisions of reward designs to level their chance to succeed. We approach this by modeling the design behavior of project creators, and identifying the behaviors that lead to project success. We propose a probabilistic generative model, the Menu-Offering-Bundle (MOB) model, to capture the offering and bundling decisions of project creators based on collected data of 14K crowdfunding projects and their 149K reward bundles across a half-year period. Our proposed model is shown to capture the offering and bundling topics and to outperform the baselines in predicting reward designs. We also find that the learned offering and bundling topics carry distinguishable meanings and provide insight into key factors in project success.
  3. Projects that pay communities or individuals to conserve natural areas rarely continue indefinitely. When payments cease, the behaviors they motivate can change. Previous research on conservation-based payments recognizes the impermanence of conservation success, but it does not consider the legacy of payments that failed to effect change. This research assesses impermanence and failure by investigating the legacy of village-level conservation payments made through one of the largest Integrated Conservation and Development Projects in Indonesia. The Kerinci-Seblat Integrated Conservation and Development Project aimed to conserve forest area and promote local development through voluntary conservation agreements (VCAs) that provided payments for pro-conservation pledges and activities from 2000 through 2003. Project documentation and previous research find that payments failed to incentivize additional forest conservation, producing nonsignificant differences in forest-cover change during the project period. To examine the legacy of these payments in the post-project period, this research uses matched difference-in-differences and triple differences models to analyze forest cover change in villages (n = 263) from 2000 through 2016 as well as matched binary logistic regression models to assess enduring differences in household (n = 1303) livelihood strategies within VCA villages in 2016. The analysis finds that VCA villages contained significantly more forest loss than the most similar non-VCA villages outside the national park, and greater payments predict increased forest loss in the post-project period. In addition, farming high-value tree crops and cultivating private land were the most important attributes for modeling VCA affiliation among randomly selected households. These results demonstrate that, after payments ceased, project failures increased in severity over time. Those who design and implement conservation-based payments bear great responsibility to ensure their projects are informed by local voice, align with community preferences, and provide sufficient benefits, lest they result in a conservation legacy of increased failure.
  4. We introduce a novel check-in time prediction problem. The goal is to predict the time a user will check in to a given location. We formulate check-in prediction as a survival analysis problem and propose a Recurrent-Censored Regression (RCR) model. We address the key challenge of check-in data scarcity, which is due to the uneven distribution of check-ins among users/locations. Our idea is to enrich the check-in data with potential visitors, i.e., users who have not visited the location before but are likely to do so. RCR uses a recurrent neural network to learn latent representations from historical check-ins of both actual and potential visitors, which are then combined with censored regression to make predictions. Experiments show RCR outperforms state-of-the-art event time prediction techniques on real-world datasets.
  5. Chiappa, Silvia; Calandra, Roberto (Ed.)
    Random forests are a powerful non-parametric regression method but are severely limited in their usage in the presence of randomly censored observations, and when naively applied can exhibit poor predictive performance due to the incurred biases. Based on a local adaptive representation of random forests, we develop its regression adjustment for randomly censored regression quantile models. Regression adjustment is based on a new estimating equation that adapts to censoring and leads to the quantile score whenever the data do not exhibit censoring. The proposed procedure, named censored quantile regression forest, allows us to estimate quantiles of time-to-event without any parametric modeling assumption. We establish its consistency under mild model specifications. Numerical studies showcase a clear advantage of the proposed procedure.