Copy and paste programming
Copy and paste programming is a pejorative term to describe highly repetitive computer programming code apparently produced by copy and paste operations. It is frequently symptomatic of a lack of programming competence, or an insufficiently expressive development environment, as subroutines or libraries would normally be used instead.

Plagiarism

Copying and pasting is often done by inexperienced or student programmers, who find the act of writing code from scratch difficult and prefer to search for a pre-written solution or partial solution they can use as a basis for their own problem solving. (See also cargo cult programming.)

As a way of applying library code

Copying and pasting is also done by experienced programmers, who often have their own libraries of well-tested, ready-to-use code snippets and generic algorithms that are easily adapted to specific tasks.

As a way of branching code

Branching code is a normal part of large-team software development, allowing parallel development on both branches and hence shorter development cycles. Classical branching has the following qualities:
  • It is managed by a version control system that supports branching.
  • Branches are re-merged once parallel development is completed.


Copy and paste is a less formal alternative to classical branching, often used when it is foreseen that the branches will diverge more and more over time, as when a new product is being spun off from an existing product.

As an approach to repetitive tasks

One of the most harmful forms of copy-and-paste programming occurs in code that performs a repetitive task. Each repetition is copied from above and pasted in again, with minor modifications. Harmful effects are discussed below.
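
The pattern can be illustrated with a small hypothetical sketch (the `sales` data and variable names are invented for illustration, not taken from any real code base). One block is pasted three times, and in the third copy one of the required modifications was missed, so the program runs without error but reports the wrong figure:

```python
# Hypothetical data: sales amounts per region.
sales = {"north": [120, 80], "south": [200, 50], "west": [90, 10]}

# First block, written by hand.
north_total = 0
for amount in sales["north"]:
    north_total += amount

# Second block, pasted from the first and edited in two places.
south_total = 0
for amount in sales["south"]:
    south_total += amount

# Third block, pasted from the second. The variable was renamed,
# but the dictionary key was never changed to "west".
west_total = 0
for amount in sales["south"]:
    west_total += amount

print(north_total, south_total, west_total)  # prints "200 250 250"
```

The correct total for `west` would be 100. Nothing in the code flags the slip, and spotting it requires comparing the three blocks line by line.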

Deliberate Design Choice

The use of programming idioms and design patterns is distinct from copy and paste programming, as they are expected to be recalled from the programmer's mind, rather than retrieved from a code bank.

There is research aimed at "decriminalizing" cut and paste, known as the Subtext programming language. Note that under this model, cut and paste is the primary model of interaction and hence not an anti-pattern.

Specific to Plagiarized Code

  • Inexperienced programmers who copy code often do not fully understand the pre-written code they are taking. As such, the problem arises more from their inexperience and lack of courage than from the act of copying and pasting per se. The code often comes from disparate sources such as friends' or co-workers' code, Internet forums, code provided by the student's professors or TAs, or computer science textbooks. The result risks being a disjointed clash of styles, and may have superfluous code that tackles problems for which solutions are no longer required.
  • Bugs can also easily be introduced by assumptions and design choices made in the separate sources that no longer apply when placed in a new environment.
  • Such code may also, in effect, be unintentionally obfuscated, as the names of variables, classes, functions, etc., are normally left unchanged, even though their purpose may be completely different in the new context than it was in the original context.

As a way of applying library code

  • Being a form of code duplication, copy and paste programming has some intrinsic problems; such problems are exacerbated if the code doesn't preserve any semantic link between the source text and the copies. In this case, if changes are needed, time is wasted hunting for all the duplicate locations. (This can be partially mitigated if the original code and/or the copy are properly commented; however, even then the problem remains of making the same edits multiple times. Also, because code maintenance often omits updating the comments, comments describing where to find remote pieces of code are notorious for going out-of-date.)
  • Adherents of object oriented methodologies further object to the "code library" use of copy and paste. Instead of making multiple mutated copies of a generic algorithm, an object oriented approach would abstract the algorithm into a reusable encapsulated class. The class is written flexibly, with full support of inheritance and overloading, so that all calling code can be interfaced to use this generic code directly, rather than mutating the original. As additional functionality is required, the library is extended (while retaining backward compatibility). This way, if the original algorithm has a bug to fix or can be improved, all software using it stands to benefit.
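
A minimal sketch of the object oriented alternative might look as follows. The `Exporter` and `TabExporter` classes are hypothetical names chosen for illustration. Python has no signature-based overloading, so the sketch relies on inheritance with method overriding. The generic algorithm lives in one method, and a variant changes only the steps that differ instead of mutating a pasted copy:

```python
class Exporter:
    """Hypothetical generic algorithm: header, one line per record, record count."""

    def export(self, records):
        lines = [self.header()]
        lines.extend(self.format_record(record) for record in records)
        lines.append(f"{len(records)} records")
        return "\n".join(lines)

    def header(self):
        return "name,value"

    def format_record(self, record):
        name, value = record
        return f"{name},{value}"


class TabExporter(Exporter):
    """Overrides only the formatting steps; export() itself is inherited."""

    def header(self):
        return "name\tvalue"

    def format_record(self, record):
        name, value = record
        return f"{name}\t{value}"


records = [("a", 1), ("b", 2)]
print(Exporter().export(records))
print(TabExporter().export(records))
```

Under this arrangement, a fix to `export` (for example, to the record count line) would reach every subclass at once, which is the benefit the objection above has in mind.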

As a way of branching code

As a way of spinning-off a new product, copy and paste programming has some advantages. Because the new development initiative does not touch the code of the existing product:
  • There is no need to regression test the existing product, saving on QA time associated with the new product launch, and reducing time to market.
  • There is no risk of introduced bugs in the existing product, which might upset the installed user base.


The downsides are:
  • If the new product does not diverge as much as anticipated from the existing product, you can wind up supporting two code bases (at twice the cost) for what is essentially one product. This can lead to expensive refactoring and manual merging down the line.
  • The duplicate code base doubles the time required to implement changes which may be desired across both products; this increases time-to-market for such changes, and may in fact wipe out any time gains achieved by branching the code in the first place.


Similar to above, the alternative to a copy-and-paste approach would be a modularized approach:
  • Start by factoring out code to be shared by both products into libraries.
  • Use those libraries (rather than a second copy of the code base) as the foundation for development of the new product.
  • If an additional third, fourth, or fifth version of the product is envisaged down the line, this approach is far stronger, because the ready-made code libraries dramatically shorten the development life cycle for any additional products after the second.

As an approach to repetitive tasks

  • For repetitive tasks, the copy and paste approach often leads to large methods (a bad code smell).
  • Each repetition creates a code duplicate, with all the problems discussed in prior sections, but with a much greater scope. Scores of duplications are common; hundreds are possible. Bug fixes, in particular, become very difficult and costly in such code.
  • Such code also suffers from significant readability issues, due to the difficulty of discerning exactly what differs between each repetition. This has a direct impact on the risks and costs of revising the code.
  • The procedural programming model strongly discourages the copy-and-paste approach to repetitive tasks. Under a procedural model, a preferred approach to repetitive tasks is to create a function or subroutine that performs a single pass through the task; this subroutine is then called by the parent routine, either repetitively or, better yet, with some form of looping structure. Such code is termed "well decomposed", and is recommended as being easier to read and more readily extensible.
  • The general rule of thumb applicable to this case is "don't repeat yourself".
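
The procedural approach described in this list can be sketched as follows. This is an illustration under assumed names (`region_total` and the `sales` data are hypothetical): a single function performs one pass through the task, and the parent routine calls it from a loop rather than repeating the body once per case.

```python
def region_total(amounts):
    """Perform a single pass through the task: total one region's sales."""
    total = 0
    for amount in amounts:
        total += amount
    return total


# Hypothetical data: sales amounts per region.
sales = {"north": [120, 80], "south": [200, 50], "west": [90, 10]}

# The parent routine loops over the cases instead of pasting the body per region.
totals = {}
for region, amounts in sales.items():
    totals[region] = region_total(amounts)

print(totals)  # prints "{'north': 200, 'south': 250, 'west': 100}"
```

Adding a region requires only a new entry in the data, and a bug in the summation would be fixed in exactly one place.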
The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 