Statistical processing
Contact info
Science, Technology and Culture, Business Statistics
Anne-Sofie Dam Bjørkman
+45 20 37 54 60
Data for these statistics are collected via questionnaires from approx. 3,600 respondents drawn from a population of approx. 20,000 enterprises. The data are validated while the enterprise is responding, followed by computer-aided and manual validation. Imputation and calibrated weighting are also part of the data processing.
Source data
The statistics are compiled on the basis of questionnaires collected from approx. 3,600 enterprises drawn as a sample from a population of approx. 20,000 enterprises. The data are collected as one part of a single questionnaire that also covers enterprises' research and development (R&D). Enterprises are sampled according to their number of full-time equivalents and type of activity (NACE). All enterprises with 100 or more full-time equivalents are included in the sample, and the probability of selection decreases as the number of full-time equivalents decreases. The probability of selection is higher within types of activity that are more R&D-intensive than within activities where R&D is less frequent. Within these constraints, the enterprises in the sample are randomly selected. Since reference year 2009 the sample has been designed as a 'rolling panel', which reduces the measurement uncertainty of the statistics.
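The sampling rules described above can be illustrated with a minimal sketch. Only the take-all rule for enterprises with 100 or more full-time equivalents comes from the text. The size classes, the probabilities, the R&D-intensity factor and the field names (`fte`, `rd_intensive`) are hypothetical, and the sketch uses independent random draws (Poisson sampling) for brevity, which is not necessarily how the actual sample is drawn.

```python
import random

# Hypothetical size classes below the take-all limit: (lower FTE bound, probability).
SIZE_CLASSES = [(50, 0.50), (10, 0.20), (0, 0.05)]
TAKE_ALL_FTE = 100          # from the text: 100+ FTE are always included
RD_INTENSIVE_FACTOR = 2.0   # hypothetical boost for R&D-intensive activities


def inclusion_probability(fte, rd_intensive):
    """Selection probability for one enterprise (fte assumed non-negative)."""
    if fte >= TAKE_ALL_FTE:
        return 1.0
    # First size class whose lower bound the enterprise reaches.
    base = next(p for lower, p in SIZE_CLASSES if fte >= lower)
    if rd_intensive:
        base *= RD_INTENSIVE_FACTOR
    return min(base, 1.0)


def draw_sample(population, seed=0):
    """Select each enterprise independently with its inclusion probability."""
    rng = random.Random(seed)
    sample = []
    for ent in population:
        p = inclusion_probability(ent["fte"], ent["rd_intensive"])
        if rng.random() < p:
            # The design weight is the inverse of the inclusion probability.
            sample.append({**ent, "design_weight": 1.0 / p})
    return sample
```

The design weight stored with each selected enterprise is the inverse of its inclusion probability, so enterprises in the take-all group represent only themselves.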
Frequency of data collection
Yearly.
Data collection
The data are collected via an electronic questionnaire at http://www.virk.dk.
Data validation
The data are validated comprehensively. In the electronic questionnaire, validation is performed on a range of variables, e.g. on totals. If the total entered by the respondent does not match the calculated total, the respondent is notified and has the opportunity to correct the total or one or more of the components. The same applies if a set of entries in the questionnaire should sum to 100 per cent but does not. If the level of a key figure entered by the respondent is much higher or lower than in the previous year, the respondent is notified and has the opportunity to correct it if necessary. This applies e.g. to R&D full-time equivalents and R&D expenses.
After data collection, the data are validated mechanically and to some extent corrected. The ICT programs that check the data for errors also produce lists of likely or actual errors. The types of errors identified as having the greatest influence on the quality of the statistics are listed together with the identification numbers of the respondents. This list is checked manually. Finally, outlier tests are carried out for key variables and combinations of these.
A minor part of the collected data is compared with other sources to assess whether the response is likely to be correct or should be corrected. This applies e.g. to the number of R&D full-time equivalents, which is compared with the total number of full-time equivalents in the enterprise, taken from the Central Business Register. The total expenditure on innovation, including expenses for own R&D, is compared with the total turnover of the enterprise, which also comes from the Central Business Register. The enterprises' public accounts are also used as a supplementary source of information.
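The checks described above (totals, percentages summing to 100, large year-on-year changes, and comparison with register data) could look roughly like the following sketch. The field names, the tolerances and the threshold for a "much higher or lower" value are assumptions made for illustration, not the rules actually used in the questionnaire.

```python
def check_total(components, reported_total, tolerance=0.5):
    """True if the reported total matches the sum of its components."""
    return abs(sum(components) - reported_total) <= tolerance


def check_percentages(shares, tolerance=0.5):
    """True if the percentage shares sum to 100."""
    return abs(sum(shares) - 100.0) <= tolerance


def large_change(current, previous, max_ratio=3.0):
    """True if the value is much higher or lower than last year's.

    max_ratio is a hypothetical threshold.
    """
    if previous <= 0 or current <= 0:
        # A ratio is undefined; flag any change to or from zero.
        return current != previous
    ratio = current / previous
    return ratio > max_ratio or ratio < 1.0 / max_ratio


def validate(response, previous, register_fte):
    """Return a list of warning codes for one response (hypothetical field names)."""
    messages = []
    if not check_total(response["rd_cost_components"], response["rd_cost_total"]):
        messages.append("total_mismatch")
    if not check_percentages(response["funding_shares"]):
        messages.append("shares_not_100")
    if large_change(response["rd_fte"], previous["rd_fte"]):
        messages.append("rd_fte_large_change")
    # R&D FTE cannot plausibly exceed the register's total FTE.
    if response["rd_fte"] > register_fte:
        messages.append("rd_fte_exceeds_total_fte")
    return messages
```

In this sketch the checks only produce warnings; whether a value is actually changed is left to the respondent or to later manual review, in line with the description above.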
Data compilation
The final, corrected data are compared to the original sample. Enterprises above a certain size that have not responded to the questionnaire have their response imputed, either using the data collected from the same enterprise in the previous year or via cold-deck imputation. Finally, calibrated weighting is carried out.
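As an illustration of the two steps above, the sketch below imputes non-respondents above a size threshold and then adjusts the weights. The threshold, the field names and the stratum-keyed cold-deck table are hypothetical. For the weighting step the sketch uses post-stratification, which is one simple form of calibration. The source does not describe the calibration model actually used.

```python
from collections import defaultdict


def impute_missing(sample, responses, previous_year, cold_deck, size_threshold):
    """Fill in non-respondents at or above the size threshold.

    Preference order: the enterprise's own answer from last year, then a
    cold-deck value looked up by stratum. Smaller non-respondents stay missing.
    """
    completed = dict(responses)
    for ent in sample:
        eid = ent["id"]
        if eid in completed or ent["fte"] < size_threshold:
            continue
        if eid in previous_year:
            completed[eid] = previous_year[eid]
        elif ent["stratum"] in cold_deck:
            completed[eid] = cold_deck[ent["stratum"]]
    return completed


def poststratify(units, population_counts):
    """Scale design weights so each stratum's weights sum to its population count."""
    weight_sums = defaultdict(float)
    for u in units:
        weight_sums[u["stratum"]] += u["design_weight"]
    return [
        {
            **u,
            "weight": u["design_weight"]
            * population_counts[u["stratum"]]
            / weight_sums[u["stratum"]],
        }
        for u in units
    ]
```

After post-stratification the final weights in each stratum add up to the known number of enterprises in that stratum, which compensates for non-response among the units that were not imputed.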
Adjustment
Not relevant for these statistics.