Check me, replicate me: Corrections & replications of my work

Please help me do better science!

Jump to: What I offer / I’ll pay you / Corrections of my work / Replications of my work / Questions about my work / Related readings & initiatives / Others who took the pledge

You can report errors using this website’s Contact Form.

Background

Since embarking on my open-science journey in 2017-8 I’ve been trying to be as open as I can about everything that I do, hoping that an open scientific process would help ensure that my work is solid and without any errors and mistakes. But for that we need a community, not a few peer reviewers. I, we, need your help in ensuring the validity of my work. I have been trying to promote replications and the validation of the published literature, and I would hope that others would do the same for me, by checking and replicating my work.

I really appreciate those who take out of their time and resources to help others check their work, revisit and assess preprints and articles, and replicate published/preprinted findings.

I would be honored to have you look into my work.

I talk about and explain this pledge and webpage in the following talk:

“Check me, replicate me” pledge - Promoting collaborative replications, assessments, & corrections

Watch this video on YouTube.

What I offer

I try and share whatever I can from the projects I run, yet if somehow something needed is missing, please reach out to me. I would do what I can to provide you with the needed information.
I would be happy to review replications of my work, and provide feedback, providing the obvious academic time pressures.
Not a must, but if you wish, I would be very glad to post about this, and credit you. If at some point it will ever get to an official correction, I would like to give you the credit for the important work you did to help me and the community do better.
I don’t have grants and funding, but I will commit out of my own money to pay you for the mistakes you find. Terms and conditions below.
I don’t have grants and funding, but I will do what I can to try and support replications of my own work. If I happen to have a budget for data collection, I will try and support your replication’s data collection. If I can, I would love to help you in seeking out funding to replicate my work.

Found a mistake in my work? Great! Thank you!
and… I’ll pay you

This idea is adopted from Science Fiction’s corrections bounty, who based this on many others. I think this is something all researchers should adopt.

I’ll pay you if you find minor or major objective errors in my preprinted and published work.

Minor errors would include reporting a numerical result incorrectly. This doesn’t include typos or grammatical errors unless they substantially change the intended meaning of what I’ve written. I’ll pay you 5 US$ if you find a minor error.
Major errors would include a major reporting problem where I analyzed or reported something incorrectly, such that there is an error in my methods analysis, there is an error in my code, or that the conclusions I draw are different than the conclusions warranted from the data. I’ll pay you 50 US$ if you find a major error.

You can report errors using this website’s Contact Form.

The rules:

These have to be objective errors – not matters of opinion or interpretation. If you just disagree with something I wrote in an article, feel free to tell me, or write a commentary/review, but it won’t warrant payment.
Please send me as many details of the error as possible: which article, what page, what code, which file, etc. and please include a detailed explanation of what’s wrong.
Obviously, the error has to be something that isn’t already listed on this page.
In most cases I’ll decide myself whether it’s an error. If you insist that there’s a mistake and I don’t agree, we might ask a third-party arbiter to decide who’s correct. If you don’t end up convincing me, it’s not like you lose any money.
Although you can send me an anonymous or pseudonymous message, if I agree you’ve found an error, you’ll have to send me a way to pay you, which might involve revealing your identity. We can discuss the specifics when you get in touch.
I’ll post your name or pseudonym (or just “[anonymous]” if you wish) alongside the correction on this page.
If you don’t want the money yourself, I’ll donate it to charity and send you the receipt. By default I’ll give to Givewell’s maximum impact fund, supporting the effective altruism movement.
I’ll keep paying out for errors until I’ve paid 2000 US$ (to be revised if I ever get any grant or funding to support this activity). I will alert issues in advance if I ever get close to this limit.

List of corrections of my work

Article: Feldman, G., & Wong, K. F. E. (2018). When action-inaction framing leads to higher escalation of commitment: A new inaction-effect perspective on the sunk-cost fallacy. Psychological science, 29(4), 537-548. DOI: 10.1177/0956797617739368 [Article] [Preprint] [Open materials/data/code] [Open access].

Issues (added to the OSF page WIKI):

Mini meta analysis aggregate of post-intervention effect in Experiment 4:
- The mini-meta analysis of the effects included the post-intervention effect in Experiment 4. For a more accurate mini-meta, we should have used the pre-intervention effect.
- No one reported this. I noticed this when going over the project/code years later (2021).
Wrong t-values
- Reported April 21, 2022, by a group in Tilburg doing reproductions: Rick Klein and Jelte Wicherts.
- In page 544 we wrote: “The participants in the escalation-as-action condition recorded the highest anticipated joy rating for successful escalation (M = 90.72, SD = 14.47), whereas those in the de-escalation- as-action condition recorded the lowest (M = 80.09, SD = 24.21), t(150) = 122.51, p = .001, mean difference = 10.63, 95% CI = [4.24, 17.02], d = 0.54)”
- The t-values t(150) = 122.51 are wrong and should be 3.29.
- The 122.51 is likely the degrees of freedom from the Welch t-test, and this is a result of bad copy-paste from the Rmarkdown output.
- This is categorized as a minor issue, given that it does not change any of our core conclusions.
- 5USD have been rewarded for finding and reporting this mistake, matched by 5USD and donated to Givewell.
- Posted on Pubpeer on April 26, 2022.

Article: Kutscher, L., & Feldman, G. (2019). The impact of past behavior normality on regret: Replication and extension of three experiments of the exceptionality effect. Cognition and Emotion, 33, 901-914. DOI: 10.1080/02699931.2018.1504747 [Article] [Preprint] [Open materials/data/code]

Issues (added to the OSF page WIKI):

Qualtrics third page extension reminder mistake
1. In a workshop I gave on 26/09/2023 I used the Qualtrics survey from the materials from the replication article, and asked the students to add extensions.
2. Credit: Found and raised by Team “Grey”: Imogen YIH; Wing Sze (Rachelle) LI; Jade TIN; Kwan Lok (Kelvin) TANG; Wing Yee (Sandy) CHENG.
3. They found that in the first run of the replication, in two of the three conditions – the last reminder on the third page of the extension, actually does not reflect the right condition, but rather the reminder just repeats the routine condition.
4. Since this is the last screen, it relates only to the extension DV appearing at the end.
5. In that article we reran that specific experiment a second time, where the issue was addressed and without comprehension checks. The results are nearly identical across the first and the second run, so it seems that participants in the first run just did not pay attention to this glitched reminder.
6. This did not seem to severely impact the results or conclusions, so I categorized it as “minor”.
  1. It’s possible that the strong comprehension checks helped with this being ignored.
  2. There is a clear indication that participants did not pay much attention to this oversight, because we were able to see a clear effect for regret, and results were nearly identical to the second corrected run.
  3. It is possible that the effect would have been bigger if the reminder would have been correct, though we find no indication for that in the second run.
7. 5USD * 5 team members (25USD) have been rewarded to the team for finding and alerting me to this glitch. They have indicated they decided to donate it, so I matched it and donated 50USD to Givewell Top Charities Fund on 28/09/2023 (Confirmation Code: 406159).
8. I added an updated Qualtrics to the OSF with the issue corrected, so that others building on our work would hopefully not be affected, and updated the project OSF WIKI.

Article: Feldman, G., Farh, J., & Wong, K. F. E. (2018). Agency beliefs over time and across cultures: Free will beliefs predict higher work satisfaction. Personality and Social Psychology Bulletin, 44, 304–317. DOI: 10.1177/0146167217739261. [Article] [Preprint] [Open materials/data/code]

Issues (added to the OSF page WIKI):

Corrections to this work:
1. Table 6 on page 312 has 0.35 for Study 1 Time 1, however the correlation is 0.36. This is a mistake, may have been due to rounding issues with SPSS versus R, I’m not sure.
2. This may have also affected the mini-meta.
  Rerunning the mini-meta analyses, the corrected aggregated mini meta effects are:
  Time 1 – 0.28 [0.19, 0.38] instead of the current 0.29 [0.19, 0.39]
  Time 2 – 0.24 [0.12, 0.36] instead of the current 0.25 [0.13, 0.38]
This seemed rather minor and did not impact any of the conclusions, and I therefore I categorized it as “minor”.
60HKD was awarded to the student who raised it in HKU Research Methods RPg class on October 24, 2023.
I added an updated Qualtrics to the OSF with the issue corrected, so that others building on our work would hopefully not be affected, and updated the project OSF WIKI.

Article: Ziano, I., Yeung, S., Cheong, S., Shi, J., & Feldman, G. (2023). “The Effort Heuristic” revisited: Mixed results for replications of Kruger et al. (2004)’s Experiments 1 and 2. Collabra:Psychology, 9 (1): 87489. https://doi.org/10.1525/collabra.87489
[Article] [Preprint] [Open materials/data/code]

Corrections to this work:
1. On January 2024, Adrien Fillon reported a minor ambiguity with the following reported statistic:
  
  “Bold” (M = 8.23, SD = 1.71) was rated higher than “The Flirt” (M = 7.96, SD = 1.66), F(1, 600) = 16.99, p < .001, d = 0.17, 95% CI [0.09, 0.25].
  
  Which reports a Cohen’s d for an ANOVA F test.
  
  We traced this to the “Kruger et al. (2004) Data Analysis After-Exclusion Experiment 2 Kit” file: https://osf.io/jxmw7 And the effect reported is based on a supplementary paired samples t-test: t(601) = 4.08, p < .001, d = 0.17, 95% CI [0.09, 0.25].
  
  We are unsure why we reported a Cohen’s d here, and it would have been preferable to report the following with partial ETA squared:
  
  “Bold” (M = 8.23, SD = 1.71) was rated higher than “The Flirt” (M = 7.96, SD = 1.66), F(1, 600) = 16.99, p < .001, η2p = 0.03, 90% CI [0.01, 0.05].
This seemed rather minor and did not impact any of the conclusions, and I therefore I categorized it as “minor”
We thanked Adrien Fillon for his reporting, and given the “Check me, replicate me” pledge, Adrien decided to donate the awarded USD5, which we matched, and donated USD10 overall to Givewell Top Impact Fund.
I updated the project OSF WIKI.

Petrov, N., *Chan, Y., *Lau, C., *Kwok, T., *Chow, L., *Lo, W., Song, W., & ^Feldman, G. (2023). Comparing time versus money in sunk cost effects: Replication Registered Report of Soman (2001). International Review of Social Psychology, 36(1): 17, 1–18.
https://doi.org/10.5334/irsp.883
[Article] [Preprint] [Open materials/data/code]

Corrections to this work:
1. Nikolay, lead author, reported on May 15, 2024: “We concluded mixed support: a) sunk time cost effect was stronger than the sunk money cost in Study 1”, “stronger” should be “weaker”. Reported on PubPeer
2. This seemed rather minor and did not impact any of the conclusions, and I therefore I categorized it as “minor”
3. I thanked Nikolay Petrov for his reporting, and given the “Check me, replicate me” pledge, we donated USD10.30 overall to Givewell Top Impact Fund.
4. I updated the project OSF WIKI.

You can report errors using this website’s Contact Form.

List of replications of my work

I don’t know of many replications of my work, I am just an early career researcher, but I’ll list whatever little I do know. Both successful and failed.

Article: Feldman, G., Chao, M. M., Farh, J. L., & Bardi, A. (2015). The motivation and inhibition of breaking the rules: Personal values structures predict unethicality. Journal of Research in Personality, 59, 69-80.

Replication #1:

Article: Ring, C., Kavussanu, M., & Gürpınar, B. (2020). Basic values predict doping likelihood. Journal of sports sciences, 38(4), 357-365.
Type: Conceptual
Findings:
- Replication shape consistency=.92, their 2^nd replication shape consistency = .91. This seems inline with our values-ethicality shape consistency =.88.
- Replication correlations: .29,-.21,.11,-.22. This seems inline with our values-ethicality meta correlations: .31,-.25,.20,-.26.

Article: Feldman, G., & Albarracín, D. (2017). Norm theory and the action-effect: The role of social norms in regret following action and inaction. Journal of Experimental Social Psychology, 69, 111-120.

Replication #1:

Article: Regret and Disappointment are Differentially Associated with Norm Compliance and Norm Deviant Failures (link to poster abstract). I was a reviewer on their submission to JESP, which for reasons I’m not sure I understand was rejected.
Type: Conceptual
Findings: Successful replication. Hope it gets published/preprinted so I can link to it and the findings.

Article: Ziano, I., Li, J., Tsun, S., Lei, H., Kamath, A., Cheng, B., & Feldman, G. (2021). Revisiting “money illusion”: Replication and extension of Shafir, Diamond, and Tversky (1997)‎. Journal of Economic Psychology, 83, 102349. DOI: 10.1016/j.joep.2020.102349

Replication #1:

A Brazilian team reported that they took our survey as is, translated to Portuguese and ran this with a Brazilian sample, and the results were very similar to our replication and the Shafir et al (1997).
Type: Direct (same stimuli, but in Portuguese using Brazilian sample)
Findings: Very similar. Preprint: https://osf.io/preprints/psyarxiv/fh597 ; Data/code available on: https://osf.io/48pqu/

Article: Chen, J., Hui, L.S., Yu, T., Feldman, G., Zeng, S., Ching, T., Ng, C., Wu, K., Yuen, C., Lau, T., Cheng, B., & Ng, K. (2021). Foregone opportunities and choosing not to act: Replications of Inaction Inertia effect. Social Psychological and Personality Science, 12(3) 333-345.
DOI: 10.1177/1948550619900570.

Replication #1:

A Brazilian team reported that they took our survey as is, translated to Portuguese and ran this with a Brazilian sample, and the results were very similar to our replication and the original article.
Type: Direct (same stimuli, but in Portuguese using Brazilian sample)
Findings: Very similar. Preprint: https://psyarxiv.com/4u98x/ ; Data/code: https://osf.io/62nxb/

Article: Chandrashekar, S. P., Yeung, S., Yau, K., Cheung, C., Agarwal, T. K., Wong, C., Pillai, T., Thirlwell, T. N., Leung, W., Li, Y., Tse, C., Cheng, B., Chan, H., & Feldman, G. (2021). Agency and self-other asymmetries in perceived bias and shortcomings: Replications of the Bias Blind Spot and extensions linking to free will beliefs. Judgment and Decision Making, 16(6), 1392-1413. DOI: 10.17605/OSF.IO/3DF5S

Replication #1:

A Brazilian team reported that they took our survey as is, translated to Portuguese and ran this with a Brazilian sample, and the results were very similar to our replication and the original article.
Type: Direct (same stimuli, but in Portuguese using Brazilian sample)
Findings: Very similar. Open access article (Collabra): https://doi.org/10.1525/collabra.122158; Data/code available on: https://osf.io/c9d2a/

Reproduction #1 (with extensions and new analyses):

A student team re-analyzed the dataset shared from this article resulting in the report – “How Believing in Free Will is Enough to Be
More Satisfied at Work” by Sami El Sabri and Liban Timir (report pdf / report quarto on github / project github)

List of questions about my work

These are issues that were raised about some of my work.

Feldman, G., Lian, H., Kosinski, M., & Stillwell, D. (2017). Frankly, we do give a damn: The relationship between profanity and honesty. Social psychological and personality science, 8(7), 816-826.

Questions were raised regarding two aspects of this article:

The use of the so-called “lie-scale” as a measure of “honesty” in Study 1.
1. A commentary was posted on the article, to which I wrote a response, which received another preprinted commentary, and a follow-up study.
2. A blog post shared some of the remarkable exchange between the commenters and the editorial team.
3. I’m puzzled by what the group of commentary authors did here, their published motivations (alternate), scientific claims and evidence raised in the commentaries. Instead of the commentaries with the vague indirect evidence and broad claims (with comments about me in the email exchanges), the more recent follow-up study was the right thing to do to begin with. This is how science should progress – by providing high-quality pre-registered open rigorous evidence. This article seems like a promising first step.
  There are a still lot of things that confuse me about this research domain, the commentaries, and the claims made, I write some of that in my response. As an early career researcher at the time, I can generally comment that the exchange with this group did not feel scientific, and I hope that future academic scientific exchanges can be more evidence based, open, positive, and constructive.
  The commentary group seems to think all issues have been resolved long ago. Maybe, I think there are still many unresolved questions in this domain, the lie-scale is self-explanatory and folk interpretations seem pretty straight-forward. That said, this is not my research domain, I am not an expert.
  I hope others will use these as a starting point for understanding this tool better. I think humility is warranted here, until this is sorted out.
  For now, I am grateful for the interest that my research sparked, and I suggest caution in interpreting these findings.
4. [My most up-to-date understanding of this is reflected in a 2021 meta-analysis summarizing: “Unlike both possibilities, the meta-analytic correlation between SD scores and prosocial behavior was close to zero, suggesting that SD scales neither clearly measure bias nor substantive traits. This conclusion was also supported by moderation analyses considering differences in the implementation of games and the SD scales used.”]
The use of the dishonesty measure based on a linguistic analysis of text.
1. In retrospect, this is weak. I based this analysis on published literature, was a bit naïve to accept it as it, and I should have spent more time re-validating this tool. I was not critical enough to recognize its possible weaknesses. I now doubt that this measure/tool really is able to detect dishonesty in text. Linguistic detection of dishonesty is yet another domain that need re-evaluation and sorting out, and is not my research domain and my expertise and ability to contribute to this domain is rather limited.
Use of MyPersonality dataset
1. I have tried since 2017 to share all of my data openly to allow reproducibility to assess my data and analyses. However, in this study I used MyPersonality, a dataset that has been used for a lot of publications, and that using this dataset required inviting the data collectors as coauthors. The main issue here was that I couldn’t share the raw data used here, and instead referred to their WIKI for others to download the datasets. However, as their notice explains, following the Cambridge Analytica events they decided to take the whole dataset down, which means that there is no longer public access to this data to allow reproducibility. That’s a shame. I’ve learned from this collaboration for open-science – I then decided I will no longer collaborate on using a dataset that I can’t publicly share to reproduce all analyses for a study.

In addition, there were many misunderstanding about this article and our conclusions in the media. I refused to take part in this media blowup other than referring to my posted clarifications on my blog.
Bottom line, this was a first step study hoping to spur follow up research, and – again – humility is warranted here. We need far more research before we can draw any conclusions here.

My own concerns

Feldman, G., Chandrashekar, S. P., & Wong, K. F. E. (2016). The freedom to excel: Belief in free will predicts better academic performance. Personality and Individual Differences, 90, 377-383.

I was concerned when faced with the p-values that came out of the studies reported in this article. Some of the p-values in this article are just below 0.05, and though there were no exclusions of participants, and no peeking, these studies were not pre-registered and the p-values that felt problematic and possibly indicated issues with flexibility of data analysis. I felt like this should be shared and communicated, but was concerned about how solid this is.

In Study 1, there might be a confound of background and English proficiency, so would be best to rerun this with a more homogeneous sample and if possible to test that they have similar English skills. In Study 2, though this reports the complete sample as I received it from the university, the effects and p-values for some of what I reported seem very weak, especially given the large sample.

In retrospect, I wish I had done much better here. I should have probably run replications of this work before proceeding to a submission. This would have been better off as a preprint awaiting further data. This became especially clear when the review process at PAID turned out to be inadequate, and when the media started reporting on this paper, overinterpreting the results (assuming causality, etc.). Even if the evidence here is solid, which remains to be determined, we are talking about variance explained of ~1%, and correlational designs

At the very least I suggest strong caution and much more humility in interpreting the results in this article, and I really hope for replications of both studies. Do you know of successful or failed replications of this work? please do share and let me know.

I am not sure why, but this seems to be one of my most highly cited papers.

Feldman, G., Chao, M. M., Farh, J. L., & Bardi, A. (2015). The motivation and inhibition of breaking the rules: Personal values structures predict unethicality. Journal of Research in Personality, 59, 69-80.

I think the correlational mini meta of many samples we did in this article’s Study 1 is pretty solid, large samples, lots of measures, and with some initial replications that give me confidence about those findings.

Yet, given the concerns regarding the profanity-honesty article Study 2 using indirect linguistic deception detection, we used a similar method here in Study 3, so, I’ll repeat what I wrote above:

“In retrospect, this is weak. I based this analysis on published literature, was a bit naïve to accept it as it, and I should have spent more time re-validating this tool. I was not critical enough to recognize its possible weaknesses. I now doubt that this measure/tool really is able to detect dishonesty in text. Linguistic detection of dishonesty is yet another domain that need re-evaluation and sorting out, and is not my research domain and my expertise and ability to contribute to this domain is rather limited.”

Study 2’s method seems very context dependent to that point of time for that status of MTurk, so I wish I used more robust methods that could remain relevant for longer.

Put together, I think the link to unethicality self-reports seems pretty solid, but we need to revisit and redo the link between values and real life unethicality using better more robust methods. I’d love to see others look further into that link, and happy to help with that.

Other people/initiatives doing related things

Bounties on catching own mistakes:

Ruben C. Arslan

Updates to one’s own work

Joshua Knobe: Responses

Readings:

Rohrer, J. M., Tierney, W., Uhlmann, E. L., DeBruine, L. M., Heyman, T., Jones, B., … & Yarkoni, T. (2021). Putting the self in self-correction: Findings from the Loss-of-Confidence Project. Perspectives on Psychological Science, 16(6), 1255-1269.
Altenmüller, M. S., Nuding, S., & Gollwitzer, M. (2021). No harm in being self-corrective: Self-criticism and reform intentions increase researchers’ epistemic trustworthiness and credibility in the eyes of the public. Public understanding of science, 30(8), 962-976.
Ramsey, R. (2021). A call for greater modesty in psychology and cognitive neuroscience. Collabra: Psychology, 7(1), 24091.
Hoekstra, R., & Vazire, S. (2021). Aspiring to greater intellectual humility in science. Nature human behaviour, 5(12), 1602-1607.

Others who took the pledge

I am hoping that others will follow and make similar pledges, to make this mainstream.
Here are others I know of who did, if you know of others please do let me know:

Gilad Feldman

Decision Making, Social Psychology, & Open Science

Check me, replicate me: Corrections & replications of my work

Background

What I offer

Found a mistake in my work? Great! Thank you!
and… I’ll pay you

List of corrections of my work

List of replications of my work

List of questions about my work

My own concerns

Other people/initiatives doing related things

Others who took the pledge

Background

What I offer

Found a mistake in my work? Great! Thank you! and… I’ll pay you

List of corrections of my work

List of replications of my work

List of questions about my work

My own concerns

Other people/initiatives doing related things

Others who took the pledge

Found a mistake in my work? Great! Thank you!
and… I’ll pay you