Comparison of languages in quantitative analytics

Learning a language takes days or weeks, so you will want to choose carefully which one fits the job scope. The choice determines which language becomes the most familiar one on your resume when you search for jobs, and so affects your employability in the industry you are interested in.

If the task is prototyping or hypothesis testing, where perhaps 90% of strategies are rejected, a less verbose language is preferred. Type references are implicit, functions are ... That leaves us with R, Python or Scala.

However, some would argue that strategies developed in R can only be superficial. R on a local machine is single-threaded by default (on a four-core machine that shows up as roughly 25% of total CPU) and keeps every object it creates in memory in its environment, whether or not the object is used later. This exhausts local memory, and some users see the R application crash when they assign a large data set to a variable. A strategy built on small data sets, the argument goes, has no alpha.

Still, many continue to use R because its open source packages are properly documented and contributed by a reliable community of academics (many of them PhD and master's students doing their coursework). R also keeps attracting such talent, including practitioners who generously share their hard work with the community. quantstrat is one example. It provides one of the fastest ways to test out a strategy, but anyone who has traded systematic strategies live knows that every tick and millisecond counts, and I know how big the difference is between backtesting and forward testing in real market conditions.

That said, if we were to use every trade executed and every order sent, instead of time bars that compress many points into one period, the data would be huge. When someone invests a large amount of money, will they be able to trust the backtest results from R? To conclude, R is more suitable for end-of-day (EOD) analysis than for real-time walk-forward analysis, where speed is a concern. Using tools outside R is inevitable.
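To make the point about compressing many ticks into one period concrete, here is a minimal Python sketch. The function name `ticks_to_bars`, the tick layout and the sample prices are all made up for illustration, and it assumes the ticks arrive in time order. It shows how six ticks collapse into two bars, so everything that happened inside each period is reduced to four prices.

```python
def ticks_to_bars(ticks, period_seconds):
    """Compress (timestamp_seconds, price) ticks into OHLC bars per period.

    Assumes ticks are already sorted by timestamp (illustrative only).
    """
    bars = {}
    for ts, price in ticks:
        # Start time of the period this tick falls into
        bucket = int(ts // period_seconds) * period_seconds
        if bucket not in bars:
            bars[bucket] = {"open": price, "high": price,
                            "low": price, "close": price, "ticks": 1}
        else:
            bar = bars[bucket]
            bar["high"] = max(bar["high"], price)
            bar["low"] = min(bar["low"], price)
            bar["close"] = price
            bar["ticks"] += 1
    return bars


# Hypothetical ticks: (seconds since session open, traded price)
ticks = [(0.1, 100.0), (0.4, 100.5), (12.0, 99.8),
         (59.9, 100.2), (60.0, 100.3), (75.5, 101.0)]

bars = ticks_to_bars(ticks, period_seconds=60)
for start, bar in bars.items():
    print(start, bar)
```

The first bar hides four ticks and the second hides two. A backtest that only sees the bars cannot tell in which order the high and the low were reached, which is one reason bar-based results can differ from live trading.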

Python is open source and has well-organised package distributions (comparable to CRAN for R) such as Anaconda, which provides sandboxed environments that pin package versions, and a large, lively and supportive community maintains and upgrades the important packages. It is no wonder that it is attracting more advocates. It supports object-oriented, procedural and functional programming styles, giving users the best of all worlds. It can use lambdas to pass expressions as arguments to functions. It supports data encapsulation through classes, where access goes through specified accessors and modifiers (by convention and properties rather than enforced permissions), reuse of a parent's common behaviour and attributes with the ability to override them through inheritance, and clear concepts through interfaces (abstract base classes) with explicit method names and optional type hints. It can create iterators and generators that produce values only as they are consumed at run time, so they do not exhaust memory. It has web frameworks (such as Flask) that let Python cover the job scope of most developers. Its reference interpreter, CPython, is written in C, so Python can be extended with modules written in C. Such extensions give it an edge over many languages.
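A small sketch of two of these features, generators and lambdas, using only the standard library. The price series, the helper names (`simulated_prices`, `running_mean`, `last`) and the scores are invented for illustration.

```python
def simulated_prices(n, start=100.0, step=0.01):
    """Yield n hypothetical prices one at a time instead of building a list."""
    price = start
    for _ in range(n):
        yield price
        price += step


def running_mean(values):
    """Yield the running mean, holding only a total and a count in memory."""
    total = 0.0
    for count, value in enumerate(values, start=1):
        total += value
        yield total / count


def last(iterable):
    """Consume an iterable and return its final item (None if empty)."""
    item = None
    for item in iterable:
        pass
    return item


# The two generators are chained; no list of 100,000 prices is ever created.
final_mean = last(running_mean(simulated_prices(100_000)))
print(round(final_mean, 3))

# A lambda passed as an argument: rank symbols by a hypothetical score.
scores = {"AAA": 0.3, "BBB": -0.1, "CCC": 0.7}
ranked = sorted(scores, key=lambda symbol: scores[symbol], reverse=True)
print(ranked)  # ['CCC', 'AAA', 'BBB']
```

Because each stage pulls one value at a time from the stage before it, memory use stays flat however long the series is.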

If the code is for production and real-time use alongside many other programs, a lower-level language has to be used, for two good reasons. One, it is widely compatible, since a low-level language has APIs to connect to many other programs and languages. Two, it has tighter memory management: the programmer controls what is placed on the stack and what on the heap, and the code is compiled ahead of time to native code instead of relying on garbage collection and JIT compilation at run time. Examples are C and C++, chosen for performance.

After all, most prototypers do not care much about these computer organisation principles. They want a simple and quick way to get the job done: documented code, languages with skills widely available in the job market for continuity, and ease of use without compile errors for not being explicit or verbose enough. Meanwhile, newer languages are being created (e.g. Scala, the "Scalable Language"). They are not so much created as evolved, like new versions, adopting the best of all worlds and adding flexibility, such as allowing both implicit and explicit type references, and immutable objects that are copied rather than modified in place, which suits distributed computing with the map-reduce framework, where data splits are fed in small increments before being aggregated in the reduce stage.
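The map-reduce idea can be sketched in a few lines of Python. This runs in a single process and only imitates the shape of the computation; a real framework would send each split to a different machine. The trade records and the function names `split`, `map_chunk` and `merge` are hypothetical.

```python
from collections import Counter
from functools import reduce


def split(records, chunk_size):
    """Cut the input into small immutable splits (tuples)."""
    for i in range(0, len(records), chunk_size):
        yield tuple(records[i:i + chunk_size])


def map_chunk(chunk):
    """Map stage: total traded volume per symbol within one split."""
    counts = Counter()
    for symbol, volume in chunk:
        counts[symbol] += volume
    return counts


def merge(left, right):
    """Reduce stage: combine two partial results into a new Counter.

    Neither input is modified, which is what makes the step safe to
    run in any order or on any worker.
    """
    return left + right


# Hypothetical (symbol, volume) trade records
trades = [("AAA", 100), ("BBB", 50), ("AAA", 25), ("CCC", 10), ("BBB", 5)]

partials = map(map_chunk, split(trades, chunk_size=2))
totals = reduce(merge, partials, Counter())
print(dict(totals))  # {'AAA': 125, 'BBB': 55, 'CCC': 10}
```

Since the splits are immutable and `merge` returns a new object, the partial results never interfere with each other, which is the property the immutability features mentioned above are meant to provide.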

FYI: why is C#/.NET preferred over C and C++?

Let’s start with the common points: all three of them (four of them, if we add Java) use the same principles in syntax: the C way.

C was first, and a history of C will make you understand why C is important. C, however, in its goal to become a portable assembler (a common sense common subset) had other priorities, and wanted to be as close to the metal as possible.

C++ includes all that C has, and adds to that encapsulation, polymorphism, templates and a whole standard library based on templates. Using data containers becomes much easier. The differences between C and C++ are subtle: where in C you could just assign a void * to a char *, for example, in C++ you have to do a proper cast, because type checking is stricter in C++. But other than that, C++ is just a cleaner version of C, with OOP and metaprogramming added.

C# and Java discard C for C++. They take C++, throw away the pointer notation, and all variables become hidden pointers (except for the value types and primitive types, for performance reasons). They make garbage collection mandatory, add metadata to your classes, derive all objects from a base class (called object or Object) that automatically adds virtual methods, and do not compile directly to native code; instead they compile to an intermediate language, called IL for C# and bytecode for Java. This means they require a runtime to execute said code, which either transforms it into native code or simply interprets it, like a virtual machine. Since that was probably the initial approach, Java’s runtime is called the JVM (Java Virtual Machine), while C#’s is called the CLR (Common Language Runtime). But in the end, to obtain performance, the runtimes try to actually generate native code, so they all come with a translator from intermediate code to native code. This is the so-called JIT (Just In Time) compilation.
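As a rough analogy only (this is Python, not C# or Java, and Python bytecode is not IL or JVM bytecode), the same compile-to-intermediate-code step can be observed in CPython with the standard `compile` function and the `dis` module. The variable names below are made up.

```python
import dis

source = "total = price * quantity"

# Step 1: compile the source text to a code object holding bytecode.
code = compile(source, "<example>", "exec")
print(type(code.co_code))  # <class 'bytes'>

# Step 2: show the intermediate instructions in readable form.
# The exact instruction names vary between Python versions.
dis.dis(code)

# Step 3: the virtual machine executes the bytecode.
namespace = {"price": 2.5, "quantity": 4}
exec(code, namespace)
print(namespace["total"])  # 10.0
```

The CLR and the JVM go one step further than this sketch shows, translating the intermediate code to native machine code at run time.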

But interpreted code is a performance issue, and the supporters of these languages keep creating new ways to improve said performance. For example, Microsoft chose to actually run the JIT before even loading some modules, only one time, so when the .NET framework gets an update, you’ll see some processes busy for a long time: they transform your common language code to native code before you even use it, and store a cached version that is available to all programs.

The difference between C#/Java and C is too big, but the differences between C#/Java and C++ are easier to pick out, and the most important one, other than said updates to the language, is the adoption of a pure OOP approach to programming. C# does it less than Java, but Java has its purists who for a long time refused to do things in a non-OOP way (and that is usually bad, because OOP is just one paradigm, and it cannot cover everything without huge performance issues).

C# and Java are also ‘owned’ languages. If you choose C#, you are tied to Microsoft products (although Mono lives on a promise from Microsoft to developers and on the fact that C# is defined as an ECMA standard). Microsoft drives the language, as well as the .NET framework that actually gives value to C# as a language.

Java used to be owned by Sun, which has now been taken over by Oracle. I won’t say more, but as we speak, Oracle has a long-standing lawsuit against Google for using Java ‘not how we wanted you to use it’. So there’s a word of warning there, as well as the fact that Oracle won’t be able to drive a language the same way Microsoft does. And C# is really ages ahead of Java.

In the meantime, C++ has the C++11 standard. This revision, and the coming C++14, aim to improve the language and make it more modern. You can see new smart pointers, lambda support, range-based for loops and all sorts of improvements for the developer that also keep the overhead of not using them at zero. C++ always had this ‘don’t pay for what you don’t use’ approach, which makes it a more mature language.

Beyond this, it’s all just a big, big flame war.

FYI: what are high-, intermediate- and low-level languages, and how do they differ?

Matt Jones, CTO of Plum Voice

It helps to think of programming languages in terms of how closely they are able to control what a computer is actually doing.  These are commonly classified as low-, intermediate- and high-level programming languages.  Computers execute machine code, and assembly language is considered the lowest level programming language; it is a human readable version of machine code.

C was created to provide a structured programming language that is easier to use than assembly.  It is considered a low-level programming language with little to no loss in performance relative to assembly.  This made C the natural choice for building operating systems and low-level software on computers because it allowed for easier development at near-assembly performance.

C++ is essentially an extension of C. The original C++ compilers just pre-compiled directly into C, which was then compiled to machine code, while modern C++ compilers can easily compile C or C++ into machine code.  C++ was designed to allow developers to use all of the existing features of C but provides a number of extensions to support object-oriented programming techniques in an intermediate-level programming language.

C# is a complete outlier in this list.  Despite its name, it has far more in common with Java than with C or C++.  C# is an object-oriented, high-level programming language.  Like Java, C# provides a number of features that make it easier for a developer to code in the language, such as type checking, bounds checking, uninitialized variable checking, and garbage collection.  While the language does not technically specify how it is executed, C# is most commonly compiled into byte-code (rather than machine code) and executes on a virtual machine (like Java) that converts the application into machine code on the fly.

Developers who are focused on performance still pick C or C++ as their language of choice.  Nearly all operating systems (kernel and low-level system software) are written in C, C++ or some combination of the two.  Most high-profile server and desktop software is also written in C++.  For example, most web browsers, office suites and games are written in C or C++.  C# remains a common choice for internal/enterprise applications but is less common for commercial software.

This is a useful reference for the languages used to develop modern high-profile software: The Programming Languages Beacon

Shehala It, Shehala
Both C and C++ give you a lower level of abstraction that, at the cost of increased complexity, provides a breadth of access to underlying machine functionality that is not necessarily exposed by other languages. C++ adds the convenience (reduced development time) of a fully object-oriented language, which can potentially add a performance cost. In terms of real-world applications, I see these languages applied in the following domains:
C

  • Kernel level software.
  • Hardware device drivers
  • Applications where access to old, stable code is required.

C,C++

  • Application or server development where memory management needs to be fine-tuned (and can’t be left to generic garbage collection solutions).
  • Development environments that require access to libraries that do not interface well with more modern managed languages.
  • Although managed C++ can be used to access the .NET framework, it is not a seamless transition.

C# provides a managed memory model that adds a higher level of abstraction again. This level of abstraction adds convenience and improves development times, but complicates access to lower-level APIs and makes specialized performance requirements problematic.
It is certainly possible to implement extremely high-performance software in a managed memory environment, but awareness of the implications is essential.
The syntax of C# is certainly less demanding (and less error-prone) than that of C/C++ and has, for the initiated programmer, a shallower learning curve.
C#

  • Rapid client application development.
  • High performance Server development (StackOverflow for example) that benefits from the .NET framework.
  • Applications that require the benefits of the .NET framework in the language it was designed for.
