This point holds especially true for the project I am currently working on, which has only a simple email contact form. I do not want a valid, human client to have to tackle entering a captcha image or solving a math problem just to send the company a message. So I set about reading up on ways to catch spam on the server side.
The solution that I’ve come up with marries two fairly old concepts together with some fairly simple logic. Its success is untested, but I think it should be fairly effective. Time will tell, and I’ll keep people posted here.
The two techniques I will be using are: timestamping and honeypots.
So what are they?
The concept of timestamping involves sending a timestamp to the form and storing it in a hidden field, effectively allowing us to know when the form was rendered. This is useful for two reasons:
1) We can check that the user spent a reasonable amount of time filling out the form. The theory here is that a human is unlikely to post the form instantly, whereas a spam bot that has scraped the page can pull out the form fields and post them back very quickly. For example, in this current project my contact form asks for a name, email, subject and message. A human can’t type these into a form in less than, say, 5-10 seconds, so when the form is submitted I can check the timestamp in the hidden field against the current time and reject a submission that is too rapid.
2) We can force the form to expire after, say, 1 hour, and show the user a simple message along the lines of “sorry, the form has expired, please try again”.
Obviously, what counts as too fast depends on the size of the form you are asking them to fill out.
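To make the two checks above a bit more concrete, here is a minimal sketch in plain Python. It is not the code from my project: the function name `check_timestamp`, the two constants and the returned strings are made up for illustration, and I am assuming the hidden field carries a Unix timestamp in seconds.

```python
import math
import time

# Assumed thresholds: tune these to the size of your own form.
MIN_FILL_SECONDS = 5           # anything faster is treated as "too rapid"
MAX_FORM_AGE_SECONDS = 60 * 60  # the form expires after 1 hour


def check_timestamp(rendered_at, now=None):
    """Classify a submission by the age of its hidden timestamp.

    rendered_at is the raw value of the hidden field (a string).
    Returns "ok", "too_fast", "expired" or "invalid".
    """
    if now is None:
        now = time.time()
    try:
        rendered = float(rendered_at)
    except (TypeError, ValueError):
        # Missing or non-numeric field: the form was tampered with.
        return "invalid"
    if not math.isfinite(rendered):
        return "invalid"

    elapsed = now - rendered
    if elapsed < MIN_FILL_SECONDS:
        # Also catches timestamps set in the future (negative elapsed).
        return "too_fast"
    if elapsed > MAX_FORM_AGE_SECONDS:
        return "expired"
    return "ok"
```

The `now` parameter is only there so the function can be tried out with fixed values; in a real view you would leave it out and let it default to the current time.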
The secret here is also to NOT redirect back to the form page, but to store the form values the user has entered in the session. That way we can show a link saying “click here to re-submit the form”, along with an explanation that the client needs to wait 10 seconds before sending the form.
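Here is a rough sketch of that hold-and-resubmit idea. I am using a plain dict as a stand-in for the session, and the names `hold_rejected_submission`, `try_resubmit` and the session keys are all hypothetical, so treat this as an outline of the logic rather than working Django code.

```python
RESUBMIT_DELAY_SECONDS = 10  # assumed wait before a held form may be re-sent


def hold_rejected_submission(session, form_data, now):
    """Keep the rejected values in the session instead of discarding them."""
    session["held_form"] = dict(form_data)
    session["held_at"] = now


def try_resubmit(session, now):
    """Return the held form values once the wait is over, else None.

    Called when the user clicks the "re-submit" link.
    """
    if "held_form" not in session:
        return None
    if now - session["held_at"] < RESUBMIT_DELAY_SECONDS:
        # Clicked too soon: keep the values and ask them to wait.
        return None
    session.pop("held_at")
    return session.pop("held_form")
```

The point of keeping the values is that a real person who was simply quick does not have to type everything again, while a bot that never follows the link gets nowhere.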
Honeypots are used to trick spambot engines into telling us they are a spam bot. Basically the logic is as follows: a spambot tries to guess the values for the inputs in a form when it submits it. So if we use CSS to completely hide a honeypot text field, the spambot is likely to try and guess a value for that field. We then assume that any submission with a value in that field comes from a bot, and display a message suggesting as much. If they are not a bot, we allow them to click a button to submit the form. This should, in theory, prevent a bot from successfully submitting the form.
Time will tell how well these two techniques work together to cut back on spam. If anyone is interested, I could probably tidy up the Django code and release it as an app for people to play with…
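The server-side half of the honeypot is tiny. As a sketch, assuming the submitted form arrives as a dict-like object and that the hidden field is called `website` (a name I picked for illustration, not the one from my project):

```python
HONEYPOT_FIELD = "website"  # hypothetical name of the CSS-hidden text field


def looks_like_bot(form_data):
    """Return True if the hidden honeypot field was filled in.

    A human never sees the field, so it should arrive empty
    (or not at all). Whitespace-only values count as empty.
    """
    value = form_data.get(HONEYPOT_FIELD, "")
    return bool(str(value).strip())
```

A field name that sounds like something a bot would want to fill in is probably a better lure than an obviously fake one, but that is a guess on my part.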